Preface


ASIACRYPT 2009, the 15th International Conference on the Theory and Appli- 
cation of Cryptology and Information Security was held in Tokyo, Japan, during 
December 6-10, 2009. The conference was sponsored by the International As- 
sociation for Cryptologic Research (IACR) in cooperation with the Technical 
Group on Information Security (ISEC) of the Institute of Electronics, Informa- 
tion and Communication Engineers (IEICE). ASIACRYPT 2009 was chaired by 
Eiji Okamoto and I had the honor of serving as the Program Chair. 

The conference received 300 submissions from which two papers were with- 
drawn. Each paper was assigned at least three reviewers, and papers co-authored 
by Program Committee members were assigned at least five reviewers. The review
process took eight weeks and consisted of two stages. In the first four-week
stage, each Program Committee member individually read and evaluated the
assigned papers (individual review phase), and in the second four-week stage
the papers were scrutinized in an extensive discussion (discussion phase). The
review reports and discussion comments reached a total of 50,000 lines. 

Finally, the Program Committee decided to accept 42 submissions, of which
two submissions were merged into one paper. As a result, 41 presentations were 
given at the conference. The authors of the accepted papers had four weeks to 
prepare final versions for these proceedings. These revised papers were not sub- 
ject to editorial review and the authors bear full responsibility for their contents. 
Unfortunately there were a number of good papers that could not be included 
in the program due to this year’s tough competition. 

Tatsuaki Okamoto delivered the 2009 IACR Distinguished Lecture. The Pro- 
gram Committee decided to give the Best Paper Award of ASIACRYPT 2009 to 
the following paper: “Improved Generic Algorithms for 3-Collisions” by Antoine 
Joux and Stefan Lucks. They received an invitation to submit a full version to 
the Journal of Cryptology. In addition to the papers included in this volume, 
the conference also featured a rump session, a forum for short and entertaining 
presentations on recent works of both a technical and non-technical nature. 

There are many people who contributed to the success of ASIACRYPT 2009. 
First I would like to thank all authors for submitting their papers to the con- 
ference. I am deeply grateful to the Program Committee for giving their time, 
expertise and enthusiasm in order to ensure that each paper received a thorough 
and fair review. Thanks also to 303 external reviewers, listed on the following 
pages, for contributing their time and expertise. Finally, I would like to thank 
Shai Halevi for maintaining his excellent Web Submission and Review Software. 
Without this system, which covers all processes from paper submission to prepa- 
ration of the proceedings, I could not have handled 300 papers so smoothly. 
Mitsuru Matsui

Related-Key Cryptanalysis of the Full AES-192
and AES-256 


Alex Biryukov and Dmitry Khovratovich 
University of Luxembourg 


Abstract. In this paper we present two related-key attacks on the full 
AES. For AES-256 we show the first key recovery attack that works for 
all the keys and has 2^{99.5} time and data complexity, while the recent
attack by Biryukov-Khovratovich-Nikolic works for a weak key class and 
has much higher complexity. The second attack is the first cryptanalysis 
of the full AES-192. Both our attacks are boomerang attacks, which are 
based on the recent idea of finding local collisions in block ciphers and 
enhanced with the boomerang switching techniques to gain free rounds 
in the middle. 

Keywords: AES, related-key attack, boomerang attack. 

The extended version of this paper is available at 
http://eprint.iacr.org/2009/317.pdf


1 Introduction 

The Advanced Encryption Standard (AES) [?], a 128-bit block cipher, is one
of the most popular ciphers in the world and is widely used for both commercial
and government purposes. It has three variants which offer different security
levels based on the length of the secret key: 128, 192, or 256 bits. Since it became
a standard in 2001 [1], the progress in its cryptanalysis has been very slow. The
best results until 2009 were attacks on 7-round AES-128 [?], 10-round AES-192 [?],
and 10-round AES-256 [?] out of 10, 12 and 14 rounds, respectively.
The two last results are in the related-key scenario.

Only recently was the first attack on the full AES-256 announced [?]. The
authors showed a related-key attack which works with complexity 2^{96} for one
out of every 2^{35} keys. They have also shown practical attacks on AES-256 (see
also [?]) in the chosen-key scenario, which demonstrates that AES-256 cannot
serve as a replacement for an ideal cipher in theoretically sound constructions
such as the Davies-Meyer mode.

In this paper we improve these results and present the first related-key attack
on AES-256 that works for all the keys and has a better complexity (2^{99.5} data
and time). We also develop the first related-key attack on the full AES-192.
In both attacks we minimize the number of active S-boxes in the key-schedule 
(which caused the previous attack on AES-256 to work only for a fraction of all 
Table 1. Best attacks on AES-192 and AES-256

Cipher   Attack                           Rounds  # keys  Data       Time       Memory   Source
AES-192  Partial sums                     8       1       2^{127.9}  2^{188}    ?        [?]
AES-192  Related-key rectangle            10      64      2^{124}    2^{183}    ?        [?]
AES-192  Related-key amplified boomerang  12      4       2^{123}    2^{176}    2^{182}  Sec. 6
AES-256  Partial sums                     9       256     2^{85}     2^{226}    2^{32}   [?]
AES-256  Related-key rectangle            10      64      2^{114}    2^{173}    ?        [?]
AES-256  Related-key differential         14      2^{35}  2^{131}    2^{131}    2^{85}   [?]
AES-256  Related-key boomerang            14      4       2^{99.5}   2^{99.5}   2^{77}   Sec. 5

keys) by using a boomerang attack [?] enhanced with boomerang switching tech-
niques. We find our boomerang differentials by searching for local collisions [?]
in a cipher. The complexities of our attacks and a comparison with the best
previous attacks are given in Table 1.

This paper is structured as follows: In Section 3 we develop the idea of local
collisions in the cipher and show how to construct optimal related-key differen-
tials for AES-192 and AES-256. In Section 4 we briefly explain the idea of a
boomerang and an amplified boomerang attack. In Sections 5 and 6 we describe
attacks on AES-256 and AES-192, respectively.

2 AES Description and Notation 

We expect that most of our readers are familiar with the description of AES and 
thus point out only the main features of AES-256 that are crucial for our attack. 

AES rounds are numbered from 1 to 14 (12 for AES-192). We denote the i-th
192-bit subkey (not to be confused with a 128-bit round key) by K^i, i.e. the first
(whitening) subkey is the first four columns of K^0. The last subkey is K^7 in
AES-256 and K^8 in AES-192. The difference in K^l is denoted by \Delta K^l. Bytes
of a subkey are denoted by k^l_{i,j}, where i, j stand for the row and column index,
respectively, in the standard matrix representation of AES, and l stands for the
number of the subkey. Bytes of the plaintext are denoted by p_{i,j}, and bytes of the
internal state after the SubBytes transformation in round r are denoted by a^r_{i,j},
with A^r depicting the whole state. Let us also denote by b^r_{i,j} the byte in position
(i, j) after the r-th application of MixColumns.

Features of AES-256. AES-256 has 14 rounds and a 256-bit key, which is two 
times larger than the internal state. Thus the key schedule consists of only 7 
rounds. One key schedule round consists of the following transformations: 
k^{l+1}_{i,0} \leftarrow S(k^l_{i+1,7}) \oplus k^l_{i,0} \oplus C^l,    0 \le i \le 3;

k^{l+1}_{i,j} \leftarrow k^{l+1}_{i,j-1} \oplus k^l_{i,j},    0 \le i \le 3, 1 \le j \le 3;

k^{l+1}_{i,4} \leftarrow S(k^{l+1}_{i,3}) \oplus k^l_{i,4},    0 \le i \le 3;

k^{l+1}_{i,j} \leftarrow k^{l+1}_{i,j-1} \oplus k^l_{i,j},    0 \le i \le 3, 5 \le j \le 7,

where S() stands for the S-box and C^l for the round-dependent constant.
Therefore, each round has 8 S-boxes.

Features of AES-192. AES-192 has 12 rounds and a 192-bit key, which is 1.5 
times larger than the internal state. Thus the key schedule consists of 8 rounds. 
One key schedule round consists of the following transformations: 

k^{l+1}_{i,0} \leftarrow S(k^l_{i+1,5}) \oplus k^l_{i,0} \oplus C^l,    0 \le i \le 3;

k^{l+1}_{i,j} \leftarrow k^{l+1}_{i,j-1} \oplus k^l_{i,j},    0 \le i \le 3, 1 \le j \le 5.

Notice that each round has only four S-boxes. 
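The AES-256 key schedule round described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (gf_mul, sbox, key_to_matrix, aes256_schedule_round, column) are hypothetical, the S-box is computed from its algebraic definition rather than a table, and the round constant is added to row 0 only, as in the FIPS-197 key expansion. The row index i+1 is taken modulo 4. The example key is the AES-256 key expansion vector from FIPS-197, Appendix A.3.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def sbox(x):
    """AES S-box: inversion in GF(2^8) followed by the affine map."""
    inv = 1
    for _ in range(254):              # x^254 is the inverse; 0 maps to 0
        inv = gf_mul(inv, x)
    y = inv
    for s in range(1, 5):             # XOR of the four left rotations
        y ^= ((inv << s) | (inv >> (8 - s))) & 0xFF
    return y ^ 0x63


SBOX = [sbox(x) for x in range(256)]
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40]


def key_to_matrix(hex_key):
    """Arrange key bytes column by column into a 4-row matrix k[i][j]."""
    raw = bytes.fromhex(hex_key)
    return [[raw[4 * j + i] for j in range(len(raw) // 4)] for i in range(4)]


def aes256_schedule_round(k, rcon):
    """Map the 4x8 subkey K^l to K^{l+1}; uses exactly 8 S-box lookups."""
    n = [[0] * 8 for _ in range(4)]
    for i in range(4):                # column 0: rotated S-box column
        n[i][0] = SBOX[k[(i + 1) % 4][7]] ^ k[i][0] ^ (rcon if i == 0 else 0)
    for j in range(1, 4):             # columns 1..3: linear
        for i in range(4):
            n[i][j] = n[i][j - 1] ^ k[i][j]
    for i in range(4):                # column 4: S-box without rotation
        n[i][4] = SBOX[n[i][3]] ^ k[i][4]
    for j in range(5, 8):             # columns 5..7: linear
        for i in range(4):
            n[i][j] = n[i][j - 1] ^ k[i][j]
    return n


def column(m, j):
    """Hex string of column j, top row first."""
    return bytes(m[i][j] for i in range(4)).hex()


K0 = key_to_matrix("603deb1015ca71be2b73aef0857d7781"
                   "1f352c073b6108d72d9810a30914dff4")
K1 = aes256_schedule_round(K0, RCON[0])
print([column(K1, j) for j in range(8)])
```

Only columns 0 and 4 pass through the S-box, which is why a difference that avoids those two inputs propagates through the key schedule deterministically.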

3 Local Collisions in AES 

The notion of a local collision comes from 
the cryptanalysis of hash functions with 
one of the first applications by Chabaud 
and Joux jH]. The idea is to inject a dif- 
ference into the internal state, causing a 
disturbance, and then to correct it with 
the next injections. The resulting differ- 
ence pattern is spread out due to the mes- 
sage schedule causing more disturbances 
in other rounds. The goal is to have as 
few disturbances as possible in order to 
reduce the complexity of the attack. 

In the related-key scenario we are al-
lowed to have a difference in the key, and
not only in the plaintext as in pure
differential cryptanalysis. However, the
attacker cannot control the key itself and
thus the attack should work for any key
pair with a given difference.

Local collisions in AES-256 are best understood on a one-round example (Fig. 1),
which has one active S-box in the internal state, and five non-zero byte differences
in the two consecutive subkeys. This differential holds with probability 2^{-6} if we
use an optimal differential for an S-box:

0x01 \xrightarrow{SubBytes} 0x1f.
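The probability of the S-box differential 0x01 to 0x1f can be checked by brute force. The following is a minimal sketch with hypothetical helper names (gf_mul, sbox, ddt_count); it counts the inputs x for which S(x) XOR S(x XOR 0x01) equals 0x1f, and then applies the MixColumns coefficients to a column whose only non-zero byte, 0x1f, sits in row 0. The resulting byte values are the ones that recur in the key-difference tables of the trails.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def sbox(x):
    """AES S-box: inversion in GF(2^8) followed by the affine map."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, x)
    y = inv
    for s in range(1, 5):
        y ^= ((inv << s) | (inv >> (8 - s))) & 0xFF
    return y ^ 0x63


SBOX = [sbox(x) for x in range(256)]


def ddt_count(d_in, d_out):
    """Number of inputs x with S(x) ^ S(x ^ d_in) == d_out."""
    return sum(1 for x in range(256) if SBOX[x] ^ SBOX[x ^ d_in] == d_out)


count = ddt_count(0x01, 0x1F)
best = max(ddt_count(0x01, d) for d in range(1, 256))
probability = count / 256            # 4 / 256 = 2^-6

# First column of the MixColumns matrix is (2, 1, 1, 3): a single 0x1f in
# row 0 spreads to four bytes, which the subkey difference must cancel.
correction = [gf_mul(c, 0x1F) for c in (2, 1, 1, 3)]
print(count, best, [hex(b) for b in correction])
```

One disturbance byte plus the four correction bytes give the five non-zero subkey bytes of the one-round local collision.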



Due to the key schedule the differences spread to other subkeys thus forming 
the key schedule difference. The resulting key schedule difference can be viewed 
as a set of local collisions, where the expansion of the disturbance (also called 
disturbance vector) and the correction differences compensate each other. The 
probability of the full differential trail is then determined by the number of 
active S-boxes in the key-schedule and in the internal state. The latter is just 
the number of the non-zero bytes in the disturbance vector. 

Therefore, to construct an optimal trail we have to construct a minimal-weight 
disturbance expansion, which will become a part of the full key schedule differ- 
ence. For the AES key schedule, which is mostly linear, this task can be viewed 
as building a low-weight codeword of a linear code. Simultaneously, correction 
differences also form a codeword, and the key schedule difference codeword is 
the sum of the disturbance and the correction codewords. In the simplest trail 
the correction codeword is constructed from the former one by just shifting four 
columns to the right and applying the S-box-MixColumns transformation. 

An example of a good key-schedule pattern for AES-256 is depicted in Figure 2
as a 4.5-round codeword. In the first four key-schedule rounds the disturbance 
codeword has only 9 active bytes (red cells in the picture), which is the lower 
bound. We want to avoid active S-boxes in the key schedule as long as possible, 
so we start with a single-byte difference in byte /cq 0 an d go backwards. Due to 
a slow diffusion in the AES key schedule the difference affects only one more 
byte per key schedule round. The correction (grey) column should be positioned 
four columns to the right, and propagates backwards in the same way. The last 
column in the first subkey is active, so all S-boxes of the first round are active 
as well, which causes an unknown difference in the first (green) column. This 
“alien” difference should be canceled by the plaintext difference. 



Fig. 2. Full key schedule difference (4.5 key-schedule rounds) for AES-256
4 Related Key Boomerang and Amplified Boomerang
Attacks 

In this section we describe two types of boomerang attacks in the related-key 
scenario. 

A basic boomerang distinguisher [?] is applied to a cipher E_K(\cdot) which is
considered as a composition of two sub-ciphers: E_K(\cdot) = E_1 \circ E_0. The first sub-
cipher is supposed to have a differential \alpha \to \beta, and the second one to have a
differential \gamma \to \delta, with probabilities p and q, respectively. In the further text
the differential trails of E_0 and E_1 are called upper and lower trails, respectively.

In the boomerang attack a plaintext pair results in a quartet with probability
p^2 q^2. The amplified boomerang attack [12] (also called rectangle attack [?]) works
in a chosen-plaintext scenario and constructs N^2 p^2 q^2 2^{-n} quartets of N plaintext
pairs. We refer to [?] for the full description of the attacks.

In the original boomerang attack paper by Wagner [?] it was noted that
the number of good ciphertext quartets is actually higher, since an attacker may
consider other \beta and \gamma (with the same \alpha and \delta). This observation can be applied
to both types of boomerang attacks. As a result, the number Q of good quartets
is expressed via amplified probabilities \hat{p} and \hat{q} as follows:

Q = \hat{p}^2 \hat{q}^2 2^{-n} N^2,

where

\hat{p} = \sqrt{\sum_{\beta} P[\alpha \to \beta]^2};    \hat{q} = \sqrt{\sum_{\gamma} P[\gamma \to \delta]^2}.    (1)
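As a small numeric illustration of the quartet count, the sketch below assumes the usual definition of an amplified probability: the square root of the sum of squared probabilities over all intermediate differences compatible with the fixed outer differences. The function names and the toy parameters (a 16-bit block, 2^20 pairs) are hypothetical and chosen only so that the result is easy to check by hand.

```python
import math


def amplified_probability(trail_probs):
    """Square root of the sum of squared differential probabilities."""
    return math.sqrt(sum(p * p for p in trail_probs))


def expected_quartets(p_hat, q_hat, n_bits, num_pairs):
    """Q = p_hat^2 * q_hat^2 * 2^(-n) * N^2 for N plaintext pairs."""
    return (p_hat ** 2) * (q_hat ** 2) * 2.0 ** (-n_bits) * num_pairs ** 2


# A single trail: the amplified probability equals the trail probability.
single = amplified_probability([2 ** -6])

# Two equally likely intermediate differences gain a factor sqrt(2).
double = amplified_probability([2 ** -6, 2 ** -6])

# Toy parameters: n = 16, N = 2^20 pairs, p_hat = q_hat = 2^-6.
q_count = expected_quartets(single, single, 16, 2 ** 20)
print(single, double, q_count)
```

Summing over several intermediate differences can only increase the estimate, which is why the amplified probabilities never hurt the attacker.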


4.1 Related-Key Attack Model

The related-key attack model is a class of cryptanalytic attacks in which the 
attacker knows or chooses a relation between several keys and is given access to 
encryption/decryption functions with all these keys. The goal of the attacker is 
to find the actual secret keys. The relation between the keys can be an arbitrary 
bijective function R (or even a family of such functions) chosen in advance by 
the attacker (for a formal treatment of the general related-key model see [?]).
In the simplest form of this attack, this relation is just a XOR with a constant:
K_2 = K_1 \oplus C, where the constant C is chosen by the attacker. This type of
relation allows the attacker to trace the propagation of XOR differences induced
by the key difference C through the key schedule of the cipher. However, more
complex forms of this attack allow other (possibly non-linear) relations between
the keys. For example, in some of the attacks described in this paper the attacker
chooses a desired XOR relation in the second subkey, and then defines the implied
relation between the actual keys as: K_2 = F^{-1}(F(K_1) \oplus C) = R_C(K_1), where
F represents a single round of the AES-256 key schedule, and the constant C is
chosen by the attacker.^1

Compared to other cryptanalytic attacks in which the attacker can manipulate
only the plaintexts and/or the ciphertexts, the choice of the relation between
secret keys gives an additional degree of freedom to the attacker. The downside of
this freedom is that such attacks might be harder to mount in practice. Still, 
designers usually try to build “ideal” primitives which can be automatically used 
without further analysis in the widest possible set of applications, protocols, or 
modes of operation. Thus resistance to such attacks is an important design goal 
for block ciphers, and in fact it was one of the stated design goals of the Rijndael 
algorithm, which was selected as the Advanced Encryption Standard. 

In this paper we use boomerang attacks in the related-key scenario. In the 
following sections we denote the difference between subkeys in the upper trail
by \Delta K^l and in the lower part by \nabla K^l.

4.2 Boomerang Switch 

Here we analyze the transition from the sub-trail E_0 to the sub-trail E_1, which
we call the boomerang switch. We show that the attacker can gain 1-2 middle
rounds for free due to a careful choice of the top and bottom differentials. The 
position of the switch is a tradeoff between the sub-trail probabilities, that should 
minimize the overall complexity of the distinguisher. Below we summarize the 
switching techniques that can be used in boomerang or amplified boomerang 
attacks on any block cipher. 

Ladder switch. By default, a cipher is decomposed into rounds. However, such 
decomposition may not be the best for the boomerang attack. We propose 
not only to further decompose the round into simple operations but also to 
exploit the existing parallelism in these operations. For example, some bytes
may be independently processed. In such a case we can switch in one byte be-
fore it is transformed and in another one after it is transformed; see Fig. 3 for
an illustration.
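The effect of the ladder switch can be observed experimentally on a toy layer of three parallel S-boxes. The sketch below is an assumption-laden illustration, not the attack itself: the S-box is a random byte permutation rather than the AES S-box, and all names are hypothetical. When the upper difference and the lower difference are active in different bytes, the quartet returns for every plaintext, which matches the ladder analysis (each S-box is assigned to the sub-cipher in which it is inactive). When both differences hit the same byte, the return rate drops.

```python
import random

rng = random.Random(2009)

# Toy S-box: a random permutation of the byte values (not the AES S-box).
S = list(range(256))
rng.shuffle(S)
S_INV = [0] * 256
for x, y in enumerate(S):
    S_INV[y] = x


def layer(state, box):
    """Apply the same byte substitution to every byte of the state."""
    return tuple(box[b] for b in state)


def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))


def boomerang_returns(p1, alpha, delta):
    """Encrypt a pair, shift both ciphertexts by delta, decrypt, compare."""
    p2 = xor(p1, alpha)
    c1, c2 = layer(p1, S), layer(p2, S)
    p3 = layer(xor(c1, delta), S_INV)
    p4 = layer(xor(c2, delta), S_INV)
    return xor(p3, p4) == alpha


def return_rate(alpha, delta, trials=1000):
    hits = 0
    for _ in range(trials):
        p1 = tuple(rng.randrange(256) for _ in range(3))
        hits += boomerang_returns(p1, alpha, delta)
    return hits / trials


# Upper trail active in byte 0, lower trail active in byte 1.
disjoint_rate = return_rate((0x01, 0, 0), (0, 0x1F, 0))
# Both trails active in byte 0: the S-box has to be paid for.
overlap_rate = return_rate((0x01, 0, 0), (0x1F, 0, 0))
print(disjoint_rate, overlap_rate)
```

A per-layer analysis that places the switch entirely before or entirely after the S-box layer would charge for one active S-box in the disjoint case, although the experiment shows that nothing is lost there.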

An example is our attack on AES-192. Let us look at the differential trails
(see Fig. 8). There is one active S-box in round 7 of the lower trail in byte
b_{0,2}. On the other hand, the S-box in the same position is not active in the
upper trail. If we switched after ShiftRows in round 6, we would "pay" the
probability in round 7 afterwards. However, we switch all the state except b_{0,2}
after MixColumns, and switch the remaining byte after the S-box application in
round 7, where it is not active. We thus do not pay for this S-box.

Feistel switch. Surprisingly, a Feistel round with an arbitrary function (e.g., an
S-box) can be passed for free in the boomerang attack (this was first observed
in the attack on the cipher Khufu in [?]). Suppose the internal state (X, Y) is
1 Note that due to the low nonlinearity of the AES-256 key schedule such a subkey relation
corresponds to a fixed XOR relation in 28 out of 32 bytes of the secret key, and a
simple S-box relation in the four remaining bytes.
Fig. 3. The ladder switch in a toy three S-box block. A switch either before or after
the S-box layer would cost probability, while the ladder does not. 


transformed to (Z = X \oplus f(Y), Y) at the end of E_0. Suppose also that the E_0
difference before this transformation is (\Delta x, \Delta y), and that the E_1 difference
after this transformation is (\Delta z, \Delta y).

As a result, variable Y in the four iterations of a boomerang quartet takes two
values: Y_0 and Y_0 \oplus \Delta y for some Y_0. Then the f transformation is guaranteed
to have the same output difference \Delta f in the quartet. Therefore, the decryption
phase of the boomerang creates the difference \Delta x in X at the end of E_0 "for
free". This trick is used in the switch in the subkey in the attack on AES-192.
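The Feistel switch argument can be verified exhaustively on a toy round with 8-bit halves. In this sketch the round function is an arbitrary random table (it need not be a permutation), and the names and difference values are hypothetical choices for illustration. The check runs over all 2^16 states: whenever the lower trail uses the same difference in Y as the upper trail, the quartet comes back with the difference (dx, dy).

```python
import random

rng = random.Random(1)

# Arbitrary round function f on bytes; deliberately not a permutation.
F = [rng.randrange(256) for _ in range(256)]


def feistel(x, y):
    """(X, Y) -> (Z = X ^ f(Y), Y)."""
    return x ^ F[y], y


def feistel_inv(z, y):
    """(Z, Y) -> (X = Z ^ f(Y), Y)."""
    return z ^ F[y], y


def quartet_returns(x, y, dx, dy, dz):
    """Upper difference (dx, dy) before the round, lower (dz, dy) after it."""
    z1, y1 = feistel(x, y)
    z2, y2 = feistel(x ^ dx, y ^ dy)
    x3, y3 = feistel_inv(z1 ^ dz, y1 ^ dy)
    x4, y4 = feistel_inv(z2 ^ dz, y2 ^ dy)
    return (x3 ^ x4, y3 ^ y4) == (dx, dy)


all_return = all(
    quartet_returns(x, y, 0x3E, 0x01, 0x1F)
    for x in range(256)
    for y in range(256)
)
print(all_return)
```

The two f-outputs that appear on the way back are the same two values that appeared on the way forward, only swapped between the pair members, so their XOR cancels regardless of f.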

S-box switch. This is similar to the Feistel switch, but costs probability only
in one of the directions. Suppose that E_0 ends with an S-box Y \Leftarrow S(X) with
difference \Delta. If the output of an S-box in a cipher has difference \Delta and if the same
difference \Delta comes from the lower trail, then propagation through this S-box is
for free on one of the faces of the boomerang. Moreover, the other direction can
use amplified probability since the specific value of the difference \Delta is not important
for the switch.^2

5 Attack on AES-256 

In this section we present a related key boomerang attack on AES-256. 

5.1 The Trail 

The boomerang trail is depicted in Figure 7 and the actual values are listed in
Tables 2 and 3. It consists of two similar 7-round trails: the first one covers rounds
2 This type of switch was used in the original version of this paper, but is not needed 
now due to change in the trails. We describe it here for completeness, since it might 
be useful in other attacks. 
Table 2. Key schedule difference in the AES-256 trail

\Delta K^l

l = 0
?  00 00 00 3e 3e 3e 3e
?  01 01 01 ?  21 21 21
?  00 00 00 1f 1f 1f 1f
?  00 00 00 1f 1f 1f 1f

l = 1
00 00 00 00 3e 00 3e 00
00 01 00 01 21 00 21 00
00 00 00 00 1f 00 1f 00
00 00 00 00 1f 00 1f 00

l = 2
00 00 00 00 3e 3e 00 00
00 01 01 00 21 21 00 00
00 00 00 00 1f 1f 00 00
00 00 00 00 1f 1f 00 00

l = 3
00 00 00 00 3e 00 00 00
00 01 00 00 21 00 00 00
00 00 00 00 1f 00 00 00
00 00 00 00 1f 00 00 00

l = 4
00 00 00 00 3e 3e 3e 3e
00 01 01 01 ?  ?  ?  ?
00 00 00 00 1f 1f 1f 1f
00 00 00 00 1f 1f 1f 1f

\nabla K^l

l = 0
?  ?  ?  ?  ?  ?  ?  00
X  X  X  X  1f 1f 1f 00
?  ?  ?  ?  1f 1f 1f 00
?  ?  ?  ?  21 21 21 00

l = 1
?  01 ?  00 ?  ?  00 00
X  00 X  00 1f 1f 00 00
?  00 ?  00 1f 1f 00 00
?  00 ?  00 21 21 00 00

l = 2
?  ?  00 00 ?  00 00 00
X  X  00 00 1f 00 00 00
?  ?  00 00 1f 00 00 00
?  ?  00 00 21 00 00 00

l = 3
?  01 01 01 3e 3e 3e 3e
X  00 00 00 1f 1f 1f 1f
?  00 00 00 1f 1f 1f 1f
?  00 00 00 21 21 21 21

l = 4
01 00 01 00 3e 00 3e 00
00 00 00 00 1f 00 1f 00
00 00 00 00 1f 00 1f 00
00 00 00 00 21 00 21 00

l = 5
01 01 00 00 3e 3e 00 00
00 00 00 00 1f 1f 00 00
00 00 00 00 1f 1f 00 00
00 00 00 00 21 21 00 00

l = 6
01 00 00 00 3e 00 00 00
00 00 00 00 1f 00 00 00
00 00 00 00 1f 00 00 00
00 00 00 00 21 00 00 00

l = 7
01 01 01 01 ?  ?  ?  ?
00 00 00 00 1f 1f 1f 1f
00 00 00 00 1f 1f 1f 1f
00 00 00 00 21 21 21 21



1-8, and the second one covers rounds 8-14. The trails differ in the position of 
the disturbance bytes: the row 1 in the upper trail, and the row 0 in the lower 
trail. This fact allows the Ladder switch. 

The switching state is the state A^9 (internal state after the SubBytes in round
9) and a special key state K_S, which is the concatenation of the last four columns
of K^3 and the first four columns of K^4. Although there are active S-boxes in
the first round of the key schedule, we do not impose conditions on them. As a
result, the difference in column 0 of K^0 is not yet known.

Related Keys. We define the relation between four keys as follows (see also
Figure 4). For a secret key K_A, which the attacker tries to find, compute its
second subkey K^1_A and apply the difference \Delta K^1 to get a subkey K^1_B, from
which the key K_B is computed. The relation between K_A and K_B is a constant
XOR relation in 28 bytes out of 32 and is computed via a function k'_{i,0} =
k_{i,0} \oplus S(k_{i+1,7}) \oplus S(k_{i+1,7} \oplus c_{i+1,7}), i = 0,1,2,3, with constant c_{i+1,7} = \Delta k^0_{i+1,7}
for the four remaining bytes.

The switch into the keys K_C, K_D happens between the 3rd and the 4th sub-
keys in order to avoid active S-boxes in the key-schedule using the Ladder switch
idea described above. We compute subkeys K^3 and K^4 for both K_A and K_B.
We add the difference \nabla K^3 to K^3_A and compute the upper half (four columns)
of K_S^C. Then we add the difference \nabla K^4 to K^4_A and compute the lower half (four
Table 3. Non-zero internal state differences in the AES-256 trail

\Delta P
?  00 00 00
?  ?  ?  ?
?  00 ?  00
?  00 00 ?

\Delta A^1
?  00 00 00
?  ?  ?  ?
00 00 ?  00
00 00 00 ?

\Delta A^3
00 00 00 00
00 1f 00 1f
00 00 00 00
00 00 00 00

\Delta A^5
00 00 00 00
00 1f 1f 00
00 00 00 00
00 00 00 00

\Delta A^7
00 00 00 00
00 1f 00 00
00 00 00 00
00 00 00 00

\nabla A^7
1f 1f 1f 1f
00 00 00 00
00 00 00 00
00 00 00 00

\nabla A^9
1f 00 1f 00
00 00 00 00
00 00 00 00
00 00 00 00

\nabla A^{11}
1f 1f 00 00
00 00 00 00
00 00 00 00
00 00 00 00

\nabla A^{13}
1f 00 00 00
00 00 00 00
00 00 00 00
00 00 00 00

\nabla C
00 00 00 00
00 00 00 00
00 00 00 00
00 00 00 00




Fig. 4. AES-256: Computing K_B, K_C, and K_D from K_A


columns) of K_S^C. From these eight consecutive columns we compute the full K_C.
The key K_D is computed from K_B in the same way.

Finally, we point out that the difference between K_C and K_D can be computed in
the backward direction deterministically since there would be no active S-boxes
till the first round. The secret key K_A, and the three keys K_B, K_C, K_D computed
from K_A as described above form a proper related-key quartet. Moreover, due
to a slow diffusion in the backward direction, as a bonus we can compute some
values in \nabla K^l even for l = 0, 1, 2, 3 (see Table 2). Hence given the byte value
k^l_{i,j} for K_A we can partly compute K_B, K_C and K_D.

Internal State. The plaintext difference is specified in 9 bytes. We require that 
all the active S-boxes in the internal state should output the difference 0x1f so
that the active S-boxes are passed with probability 2^{-6}. The only exception is
the first round where the input difference in nine active bytes is not specified. 
Let us start a boomerang attack with a random pair of plaintexts that fit the
trail after one round. Active S-boxes in rounds 3-7 are passed with probability
2^{-6} each, so the overall probability is 2^{-30}.

We switch the internal state in round 9 with the Ladder switch technique: 
the row 1 is switched before the application of S-boxes, and the other rows are 
switched after the S-box layer. As a result, we do not pay for active S-boxes at 
all in this round. 

The second part of the boomerang trail is quite simple. Three S-boxes in rounds
10-14 contribute to the probability, which is thus equal to 2^{-18}. Finally we get one
boomerang quartet after the first round with probability 2^{-30-30-18-18} = 2^{-96}.


5.2 The Attack 

The attack works as follows. Do the following steps 2^{25.5} times:

1. Prepare a structure of plaintexts as specified below.

2. Encrypt it on keys K_A and K_B and keep the resulting sets S_A and S_B in
memory.

3. XOR \nabla C to all the ciphertexts in S_A and decrypt the resulting ciphertexts
with K_C. Denote the new set of plaintexts by S_C.

4. Repeat the previous step for the set S_B and the key K_D. Denote the set of
plaintexts by S_D.

5. Compose from S_C and S_D all the possible pairs of plaintexts which are equal
in 56 bits.

6. For every remaining pair check if the difference in p_{i,0}, i > 1 is equal on both
sides of the boomerang quartet (16-bit filter). Note that \nabla k^0_{i,7} = 0 so \Delta k^0_{i,0}
should be equal for both key pairs (K_A, K_B) and (K_C, K_D).

7. Filter out the quartets whose difference cannot be produced by active S-
boxes in the first round (one-bit filter per S-box per key pair) and active
S-boxes in the key schedule (one-bit filter per S-box), which is a 2 \cdot 2 + 2 = 6-bit
filter.

8. Gradually recover key values and differences, simultaneously filtering out the
wrong quartets.

Each structure has all possible values in column 0 and row 0, and constant values
in the other bytes. Of 2^{72} texts per structure we can compose 2^{144} ordered
pairs. Of these pairs 2^{144 - 8 \cdot 9} = 2^{72} pass the first round. Thus we expect one
right quartet per 2^{96-72} = 2^{24} structures, and three right quartets out of 2^{25.5}
structures.

Let us now compute the number of noisy quartets. About 2^{144-56-16} =
2^{72} pairs come out of step 6. The next step applies a 6-bit filter, so we get
2^{72+25.5-6} = 2^{91.5} candidate quartets in total.
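The counting argument above is plain arithmetic on exponents of two, and a short script makes it easy to recheck. All variable names are hypothetical. One step is an inference rather than a statement from the text: the data complexity is obtained by assuming that every text of every structure is processed under all four related keys, which adds 2 to the exponent and reproduces the stated 2^{99.5}.

```python
# All quantities are base-2 logarithms unless noted otherwise.
TEXTS_PER_STRUCTURE = 72                 # nine free plaintext bytes
PAIRS_PER_STRUCTURE = 2 * TEXTS_PER_STRUCTURE
FIRST_ROUND_COST = 8 * 9                 # nine active bytes, 8 bits each
QUARTET_COST = 30 + 30 + 18 + 18         # both trails, both boomerang faces
STRUCTURES = 25.5

pairs_after_round_1 = PAIRS_PER_STRUCTURE - FIRST_ROUND_COST
structures_per_right_quartet = QUARTET_COST - pairs_after_round_1
right_quartets = 2 ** (STRUCTURES - structures_per_right_quartet)

# Assumption: four oracle calls per text (keys K_A, K_B, K_C, K_D).
data_complexity = STRUCTURES + TEXTS_PER_STRUCTURE + 2

# Noise: 56-bit and 16-bit plaintext filters, then a 6-bit filter.
pairs_after_filters = PAIRS_PER_STRUCTURE - 56 - 16
candidate_quartets = pairs_after_filters + STRUCTURES - 6

print(pairs_after_round_1, structures_per_right_quartet,
      round(right_quartets, 2), data_complexity, candidate_quartets)
```

The value of right_quartets is about 2.8, which the text rounds to three right quartets.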

The remainder of this section deals with gradual recovering of the key and 
filtering wrong quartets. The key bytes are recovered as shown in Figure 5.
Fig. 5. Gradual key recovery. Digits stand for the steps, 'D' means difference.


1. First, consider 4-tuples of related key bytes in each position < 4. Two 

differences in a tuple are known by default. The third difference is unknown 
but is equal for all tuples (see Table 2, where it is denoted by X) and gets
one of 2^7 values. We use this fact for key derivation and filtering as follows.
Consider key bytes and The candidate quartet proposes 2 2 candi- 
dates for both 4-tuples of related-key bytes, or 2 4 candidates in total. Since 
the differences are related with the X-difference, which is a 9-bit filter, this
step reveals two key bytes and the value of X and reduces the number of
quartets to 2^{91.5-5} = 2^{86.5}.

2. Now consider the value of Ak\ 0 , which is unknown yet and might be different 
in two pairs of related keys. Let us notice that it is determined by the value of 

7 , and V &2 7 = 0, so that /Afc 9 0 is the same for both related key pairs and 
can take 2 7 values. Each guess of Ak\ 0 proposes key candidates for byte 
k ,2 0 , where we have a 8-bit filter for the 4-tuple of related-key bytes. We 
thus derive the value of fc 9 0 in all keys and reduce the number of candidate 
quartets to 2^{85.5}.

3. The same trick holds for the unknown Ak 9 4 , which can get 2 7 possible values 
and can be computed for both key pairs simultaneously. Each of these values 
proposes four candidates for k ( ( , which are filtered with an 8-bit filter. We 
thus recover fc 9 4 and Ak ( ( 4 and reduce the number of quartets to 2 79 " 5 . 

4. Finally, we notice that Ak± 4 is completely determined by fc 9 0 , fc? i, fc 9 2 , 3 > 

and fc 9 7 . There are at most two candidates for the latter value as well as for 
Aki 4 , so we get a 6-bit filter and reduce the number of quartets to 2 72 5 . 

5. Each quartet also proposes two candidates for each of key bytes 0 , fc 9 i2 , 
and fc 9 3 . Totally, the number of key candidates proposed by each quartet 
is 2^6.

The key candidates are proposed for 11 bytes of each of four related keys. How-
ever, these bytes are strongly related so the number of independent key bytes on
which the voting is performed is significantly smaller than 11 \times 4. At least, bytes
k^0_{0,0}, k^0_{1,1}, k^0_{2,2} and k^0_{3,3} of K_A and K_C are independent so we recover 15 key
bytes with 2^{78.5} proposals. The probability that three wrong quartets propose
the same candidates does not exceed 2^{-80}.

We thus estimate the complexity of the filtering step as 2^{77.5} time and memory.
We recover 3 \cdot 7 + 8 \cdot 8 = 85 bits of K_A (and 85 bits of K_C) with 2^{99.5} data
and time and 2^{77.5} memory.

The remaining part of the key can be found with many approaches. One is
to relax the condition on one of the active S-boxes in round 3 thus getting four
more active S-boxes in round 2, which in turn leads to a full-difference state
in round 1. The condition can be actually relaxed only for the first part of the
boomerang (the key pair (K_A, K_B)) thus giving a better output filter. For each
candidate quartet we use the key bytes that were recovered at the previous
step to compute \Delta A^1 and thus significantly reduce the number of keys that are
proposed by a quartet. We then rank candidates for the first four columns of
K_A and take the candidate that gets the maximal number of votes. Since we
do not make key guesses, we expect that the complexity of this step is smaller
than the complexity of the previous step (2^{99.5}). The right quartet also provides
information about four more bytes in the right half of K^0_A that correspond to
the four active S-boxes in round 2. The remaining 8 bytes of K_A can be found
by exhaustive search.


6 Attack on AES-192 

The key schedule of AES-192 has better diffusion, so it is hard to avoid active S-
boxes in the subkeys. We construct a related-key boomerang attack with two sub-
trails of 6 rounds each. The attack is an amplified-boomerang attack because we
have to deal with truncated differences in both the plaintext and the ciphertext;
the latter would be expensive to handle in a plain boomerang attack.


6.1 The Trail 

The trail is depicted in Figure 8 and the actual values are listed in Tables 4
and 5. The key schedule codeword is depicted in Figure 6.


Table 4. Internal state difference in the AES-192 trail 
Table 5. Key schedule difference in the AES-192 trail

\Delta K^l

l = 0
00 3e 3e 3f 3e 01
00 1f 1f 1f 1f 00
00 1f 1f 1f 1f 00
?  21 21 21 21 00

l = 1
00 3e 00 3f 01 00
00 1f 00 1f 00 00
00 1f 00 1f 00 00
00 21 00 21 00 00

l = 2
00 3e 3e 01 00 00
00 1f 1f 00 00 00
00 1f 1f 00 00 00
00 21 21 00 00 00

l = 3
00 3e 00 01 01 01
00 1f 00 00 00 00
00 1f 00 00 00 00
00 21 00 00 00 00

l = 4
00 3e 3e 3f 3e 3f
00 1f 1f 1f 1f 1f
00 1f 1f 1f 1f 1f

\nabla K^l

l = 0
?  ?  ?  3e 3f 3e
?  ?  ?  1f 1f 1f
?  ?  ?  1f 1f 1f
?  ?  ?  ?  21 21

l = 1
?  ?  3f 01 3e 00
?  ?  1f 00 1f 00
?  ?  1f 00 1f 00
?  ?  ?  00 21 00

l = 2
?  3e 01 00 3e 3e
?  1f 00 00 1f 1f
?  1f 00 00 1f 1f
?  ?  00 00 21 21

l = 3
3e 00 01 01 3f 01
1f 00 00 00 1f 00
1f 00 00 00 1f 00
?  00 00 00 21 00

l = 4
1f 1f 1f 1f 00 00

l = 5
1f 00 1f 00 00 00
1f 00 1f 00 00 00

l = 6
3e 3e 01 00 00 00
1f 1f 00 00 00 00
1f 1f 00 00 00 00
21 21 00 00 00 00

l = 7
3e 00 01 01 01 01
1f 00 00 00 00 00
1f 00 00 00 00 00
21 00 00 00 00 00

l = 8
3e 3e 3f 3e 3f 3e
1f 1f 1f 1f 1f 1f
1f 1f 1f 1f 1f 1f


[Figure: key schedule codeword; panel labels: E_0, Disturbance, Correction, Key schedule]

Fig. 6. AES-192 key schedule codeword


Related Keys. We define the relation between four keys similarly to the attack
on AES-256. Assume we are given a key $K_A$, which the attacker tries to find.
We compute its subkey $K_A^1$ and apply the difference $\Delta K^1$ to get the subkey $K_B^1$,
from which the key $K_B$ is computed. Then we compute the subkeys $K_A^4$ and
$K_B^4$ and apply the difference $\nabla K^4$ to them. We get subkeys $K_C^4$ and $K_D^4$, from
which the keys $K_C$ and $K_D$ are computed.

Now we prove that keys $K_A$, $K_B$, $K_C$, and $K_D$ form a quartet, i.e. the subkeys
of $K_C$ and $K_D$ satisfy the equations $K_C^l \oplus K_D^l = \Delta K^l$, $l = 1, 2, 3$. The only active
S-box is positioned between $K^3$ and $K^4$, whose input is $k^3_{0,5}$. However, this
S-box gets the same pair of inputs in both key pairs (see the "Feistel switch" in
Sec. 4.2). Indeed, if we compute $\nabla k^3_{0,5}$ from $\nabla K^4$, then it is equal to $\Delta k^3_{0,5}$ =
0x01. Therefore, if the active S-box gets as input $\alpha$ and $\alpha \oplus 1$ in $K_A$ and $K_B$,
respectively, then it gets $\alpha \oplus 1$ and $\alpha$ in $K_C$ and $K_D$, respectively. As a result,
$K_C^3 \oplus K_D^3 = \Delta K^3$, the further propagation is linear, so the four keys form
a quartet.

Fig. 7. AES-256 $E_0$ and $E_1$ trails. Green ovals show an overlap between the two trails
where the switch happens.

Fig. 8. AES-192 trail

Due to a slow diffusion in the backward direction, we can compute some values 
in V l\ l even for small l (Table EJ). Hence given k[ 3 for Ka we can partly compute 
Kb, Kc and Kb, which provides additional filtration in the attack. 

Internal State. The plaintext difference is specified in 10 bytes; the difference
in the other six bytes is not restricted. The three active S-boxes in rounds
2-4 are passed with probability $2^{-6}$ each. In round 6 (the switching round) we
ask for the fixed difference only in $a^6_{0,2}$; the other two S-boxes can output any
difference such that it is the same as in the second related-key pair. Therefore,
the amplified probability of round 6 equals $2^{-6-2\cdot 3.5} = 2^{-13}$. We switch
between the two trails before the key addition in round 6 in all bytes except $b^6_{0,2}$,
where we switch after the S-box application in round 7 (the Ladder switch). This
trick allows us not to take into account the only active S-box in the lower trail
in round 7. The overall probability of the rounds 3-6 is $2^{-3\cdot 6-13} = 2^{-31}$.

The lower trail has 8 active S-boxes in rounds 8-12. Only the first four active
S-boxes are restricted in the output difference, which gives us probability $2^{-24}$
for the lower trail. The ciphertext difference is fully specified in the middle two
rows, and has 35 bits of entropy in the other bytes. More precisely, each $\nabla c_{0,*}$ is
taken from a set of size $2^7$, and all the $\nabla c_{3,*}$ should be the same on both sides
of the boomerang and again should belong to a set of size $2^7$. Therefore, the
ciphertext difference gives us a 93-bit filter.


6.2 The Attack 

We compose $2^{73}$ structures with $2^{48}$ texts each. Then we encrypt
all the texts with the keys $K_A$ and $K_C$, and their complements w.r.t. $\Delta P$ on
$K_B$ and $K_D$. We keep all the data in memory and analyze it with the following
procedure:

1. Compose all candidate plaintext pairs for the key pairs ( Ka,Kb ) and 
(K c ,K d ). 

2. Compose and store all the candidate quartets of the ciphertexts. 

3. For each guess of the subkey bytes: kg 3 , fc 23 , and k{j 5 in Ka; kg 5 in Ka 
and Kb'- 

(a) Derive values for these bytes in all the keys from the differential trail.
Derive the yet unknown key differences in $\Delta K^0$ and $\nabla K^8$.

(b) Filter out candidate quartets that contradict $\nabla K^8$.

(c) Prepare counters for the yet unknown subkey bytes that correspond to 
active S-boxes in the first two rounds and in the last round: /c{| 0 , k§ 1} 
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ki 2 , ^ 3,0 — i n keys Ka and Kc , fc 8 0 , kg tl , /c 8 2 > ^ 0,3 — i n keys Ka and 
Kb, i.e. 16 bytes in total. 

(d) For each candidate quartet derive possible values for these unknown 
bytes and increase the counters. 

(e) Pick the group of 16 subkey bytes with the maximal number of votes. 

(f) Try all possible values of the yet unknown 9 key bytes in $K^0$ and check
whether it is the right key. If not, then go to the first step.

Right quartets. Let us first count the number of right quartets in the data.
Evidently, there exist $2^{128}$ pairs of internal states with the difference $\Delta A^2$.
The inverse application of 1.5 rounds maps these pairs into structures that we
have defined, with $2^{48}$ pairs per structure. Therefore, each structure has $2^{48}$
pairs that pass 1.5 rounds, and $2^{73}$ structures have $2^{121}$ pairs. Of these pairs,
$2^{(121-31)\cdot 2-128} = 2^{52}$ right quartets can be composed after the switch in the
middle. Of these quartets, $2^{52-2\cdot 24} = 16$ right quartets come out of the last round.
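
The counting argument for right quartets can be replayed with base-2 exponents. The following is a minimal sketch that only re-does the arithmetic quoted in the text; the variable names are illustrative and not taken from the paper.

```python
# All quantities are base-2 logarithms of the counts quoted in the text.
structures_log2 = 73            # number of structures
pairs_per_structure_log2 = 48   # pairs per structure that pass the first 1.5 rounds
upper_trail_log2 = -31          # probability of the upper trail
state_space_log2 = 128          # number of 128-bit internal states
lower_trail_log2 = -24          # probability of the lower trail, per pair

pairs_log2 = structures_log2 + pairs_per_structure_log2

# Amplified boomerang: square the number of pairs that survive the upper
# trail and divide by the size of the state space.
quartets_after_switch_log2 = 2 * (pairs_log2 + upper_trail_log2) - state_space_log2

# Both pairs of a quartet must follow the lower trail.
right_quartets_log2 = quartets_after_switch_log2 + 2 * lower_trail_log2

print(pairs_log2, quartets_after_switch_log2, 2 ** right_quartets_log2)
```

The three printed values correspond to the exponents 121 and 52 and to the final count of right quartets given above.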

Now we briefly describe the attack. Full details will be published in the
extended version. In steps 1 and 2 we compose $2^{152}$ candidate quartets. The guess
of five key bytes gives a 32-bit filter in step 3, so we are left with $2^{120}$ candidate
quartets, which are divided according to $\nabla c_{3,0}$ into $2^{14}$ groups. Then we perform
key ranking in each group and recover 16 more key bytes. The exhaustive search
for the remaining 9 key bytes can be done with complexity $2^{72}$. The overall
time complexity is about $2^{176}$, and the data complexity is $2^{123}$.

7 Conclusions 

We presented related-key boomerang attacks on the full AES-192 and the full 
AES-256. The differential trails for the attacks are based on the idea of finding 
local collisions in the block cipher. We showed that optimal key-schedule trails 
should be based on low-weight codewords in the key schedule. We also exploit 
various boomerang-switching techniques, which help us to gain free rounds in 
the middle of the cipher. However, both our attacks are still mainly of theoretical 
interest and do not present a threat to practical applications using AES. 
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Disclaimer on colors. We intensively use colors in our figures in order to 
provide better understanding on the trail construction. In figures, different colors 
refer to different values, which is hard to depict in black and white. However, 
we also list all the trail differences in the tables, so all the color information is
actually duplicated.

Trail details. By $\Delta A^i$ we denote the upper trail difference in the internal state
after the S-box layer, and by $\nabla A^i$ the same for the lower trail.
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Abstract. In this paper, we formalize an attack scheme using the key- 
dependent property, called key-dependent attack. In this attack, the in- 
termediate value, whose distribution is key-dependent, is considered. The 
attack determines whether a key is right by conducting statistical hy- 
pothesis test of the intermediate value. The time and data complexity of 
the key-dependent attack is also discussed. 

We also apply the key-dependent attack to reduced-round IDEA. This
attack is based on the key-dependent distribution of certain items in the
Biryukov-Demirci Equation. The attack on the 5.5-round variant of IDEA
requires $2^{21}$ chosen plaintexts and $2^{112.1}$ encryptions. The attack on the
6-round variant requires $2^{49}$ chosen plaintexts and $2^{112.1}$ encryptions.
Compared with the previous attacks, the key-dependent attacks on 5.5-round
and 6-round IDEA have the lowest time and data complexity,
respectively.

Keywords: Block Cipher, Key-Dependent Attack, IDEA. 


1 Introduction 

In current cryptanalysis on block ciphers, widespread attacks use special proba- 
bility distributions of certain intermediate values. These probability distributions 
are considered as invariant under different keys used. For example, differential 
cryptanalysis jZj makes use of the probability of the intermediate differential 
with high probability. Its value is assumed not to vary remarkably with different 
keys. Linear cryptanalysis m is based on the bias of the linear approximation, 
which is also generally constant for different keys. 

Instead of concentrating on the probability distribution which is invariant for
different keys, Ben-Aroya and Biham first proposed the key-dependent property
in [2]. The key-dependent property means that the probability distribution of
an intermediate value varies for different keys. In [2], an attack on Lucifer using
a key-dependent differential was presented. Knudsen and Rijmen also used a similar
idea to attack DFC in (2D| ■ 
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In this paper, we consider the key-dependent property further. The distribu- 
tion of intermediate value which is key-dependent is called key-dependent dis- 
tribution. Assume that there are some randomly chosen encryptions. For the 
intermediate values calculated from these encryptions with the actual key, they 
should conform to key-dependent distribution. On the other hand, if we use a 
wrong key to calculate the intermediate values, they are assumed to conform 
to random distribution. Basing on key-dependent distribution, we formalize a 
scheme of discovering the actual key by performing statistical hypothesis test 
El on possible keys, and we call this scheme key- dependent attack. For a given 
key, the null hypothesis of the test is that the intermediate value conforms to the 
key-dependent distribution determined by the key. The samples of the test are 
the intermediate values calculated from a few encryptions. If the test is passed, 
the given key is concluded to be the actual key, otherwise it is discarded. For the 
keys that share the same key-dependent distribution and the same intermediate 
value calculation, the corresponding hypothesis tests can be merged to reduce 
the time needed. By this criterion, the whole key space is divided into several 
key-dependent subsets. 

Due to the scheme of the key-dependent attack, the time complexity of the 
attack is determined by the time for distinguishing between the random dis- 
tribution and the key-dependent distribution. The time needed relies on the 
entropy of the key-dependent distribution: the closer the key-dependent distri- 
bution is to the random distribution, the more encryptions are needed. For each 
key-dependent subset, the number of encryptions and the criteria of rejecting 
hypothesis can be chosen so that the attack on this subset is optimized. The 
expected time of the attack on each subset is also obtained. 

The total expected time complexity can be calculated from the expected time 
on each key-dependent subset. Different orders of the key-dependent subsets 
attacked have different expected time complexities. The order with minimal 
expected time complexity is presented. The total expected time complexity is 
also minimized in this way if the actual key is supposed to be chosen uniformly 
from the whole key space. 

This paper also presents a key-dependent attack on block cipher 
IDEA. The block cipher IDEA (International Data Encryption Algorithm) 
was proposed in [211221 . The cryptanalysis of IDEA was discussed in 
[115141516181111 111 211 dll 411 511 fill 811 1124I25[ , and no attack on full version IDEA 
is faster than exhaustive search so far. We investigate the Biryukov-Demirci 
Equation, which is widely used in recent attacks on IDEA [I I5ltil I Mil til I 8j . We 
find that particular items of Biryukov-Demirci Equation satisfy key-dependent 
distribution under some specific constraints. This makes it possible to perform 
the key-dependent attack on IDEA. Biryukov-Demirci Equation is used to re- 
cover the intermediate values from encryptions. 

Our key-dependent attack on the 5.5-round variant of IDEA requires $2^{21}$ chosen
plaintexts and has a time complexity of $2^{112.1}$ encryptions. Our key-dependent
attack on the 6-round variant of IDEA requires $2^{49}$ chosen plaintexts and has
a time complexity of $2^{112.1}$ encryptions. These attacks use both fewer chosen
Table 1. Selected results of attacks on IDEA

Rounds  Attack type                        Data              Time       Ref.
4.5     Impossible Differential            2^64 CP           2^112
4.5     Linear                             16 CP             2^103
5†      Meet-in-the-Middle                 2^24 CP           2^126
5†      Meet-in-the-Middle                 2^24.6 CP         2^124
5       Linear                             2^18.5 KP         2^103
5       Linear                             2^19 KP           2^103
5       Linear                             16 KP             2^114
5.5     Higher-Order Differential-Linear   2^32 CP           2^126.85
6       Higher-Order Differential-Linear   2^64 - 2^52 KP    2^126.8
5†      Key-Dependent                      2^17 CP           2^125.5    Section 5.3
5†      Key-Dependent                      2^64 KP           2^115.3    Section 5.3
5.5     Key-Dependent                      2^21 CP           2^112.1    Section 5.1
6       Key-Dependent                      2^49 CP           2^112.1    Section 5.2

CP - Chosen Plaintext, KP - Known Plaintext,
† Attack on IDEA starting from the first round.

plaintexts and less time than all the previous corresponding attacks. We also
give two key-dependent attacks on 5-round IDEA starting from the first round.
One requires $2^{17}$ chosen plaintexts and needs $2^{125.5}$ encryptions. The other one
requires $2^{64}$ known plaintexts and needs $2^{115.3}$ encryptions. We summarize our
attacks and previous attacks in Table 1, where the data complexity is measured
in the number of plaintexts and the time complexity is measured in the number
of encryptions needed in the attack.

The paper is organized as follows: In Section 2 we give a general view of the
key-dependent attack. In Section 3 we give a brief description of the IDEA block
cipher. In Section 4 we show that the probability distribution of some items of
the Biryukov-Demirci Equation is a key-dependent distribution. In Section 5 we
present key-dependent attacks on reduced-round IDEA. Section 6 concludes
this paper.

2 The Key-Dependent Attack 

In [2], Ben-Aroya and Biham first proposed the key-dependent property and
implemented a key-dependent differential attack on Lucifer. Knudsen and Rijmen
also used similar idea to attack DFC in 123- 

In this section, we formalize a scheme of identifying the actual key using the 
following key-dependent property (with high success probability). 

Definition 1. For a block cipher, if the probability distribution of an interme- 
diate value varies for different keys under some specific constraints, then this 
probability distribution is defined as key-dependent distribution. 




X. Sun and X. Lai 


Consider some randomly chosen encryptions satisfying the specific constraints. 
If one uses the actual key to calculate the intermediate value, it should conform 
to key-dependent distribution. If one uses a wrong key to calculate the inter- 
mediate value, it is assumed to be randomly distributed. With such a property, 
determining whether a given key is right can be done by distinguishing which 
distribution the intermediate value conforms to, the key-dependent distribution 
or the random distribution. 

We propose an attack scheme, called key-dependent attack, using key-dependent 
distribution. The attack uses statistical hypothesis test, whose idea is also used 
in differential and linear attack m to distinguish between key-dependent dis- 
tribution and random distribution. For a key, the null hypothesis of the test is 
that the intermediate value conforms to the key-dependent distribution deter- 
mined by the key. Then the attack uses some samples to determine whether the 
hypothesis is right. The samples of the statistical hypothesis test are the inter- 
mediate values obtained from the encryptions satisfying the specific constraints. 
If the key passes the hypothesis test, the attack concludes that the key is right, 
otherwise the key is judged to be wrong. 

For the keys that share the same key-dependent distribution and the same in- 
termediate value calculation, the corresponding hypothesis tests can be merged. 
Hence the whole key space is divided into several key-dependent subsets. (A similar
idea is proposed in [2].)

Definition 2. A key-dependent subset is a tuple ( P , U), where P is a fixed key- 
dependent distribution of intermediate value, and U is a set of keys that share the 
same key-dependent distribution P and the same intermediate value calculation. 

Definition 3. The key fraction (f) of a key-dependent subset is the ratio be- 
tween the size of U and the size of the whole key space. 

The key-dependent attack determines which key-dependent subset the actual key 
is in by conducting hypothesis tests on each key-dependent subset. Such process 
on a key-dependent subset ( P,U ), called individual attack, can be described as 
the following four phases: 

1. Parameter Determining Phase Determine the size of the samples and 
the criteria of rejecting the hypothesis that the intermediate values conform 
to P. 

2. Data Collecting Phase Randomly choose some encryptions according to
the specific constraints.¹

3. Judgement Phase Calculate the intermediate values from the collected 
encryptions. If the results satisfy the criteria of rejection, then discard this 
key-dependent subset, otherwise enter the next phase. 

4. Exhaustive Search Phase Exhaustively search U to find the whole key. If 
the exhaustive search does not find the whole actual key, then start another 
individual attack on the next key-dependent subset. 

1 Though each individual attack chooses encryptions randomly, one encryption can be
used for many individual attacks, thereby reducing the total data complexity.
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The time complexity of the key-dependent attack is determined by the time 
complexity of each individual attack and the order of performing these individual 
attacks. 

For a key-dependent subset (P, U), the time needed for individual attacks re- 
lies on the entropy of P: the closer P is to the random distribution, the more diffi- 
cult the attack is — to ensure the same probability of making the right judgement, 
the attack needs more encryptions. This indicates that individual attacks for 
different key-dependent subsets have different time complexities. The time com- 
plexity of each individual attack is determined by corresponding key-dependent 
distribution P. For each key-dependent subset, the number of encryptions and 
the criteria of rejecting hypothesis are then chosen to minimize the time com- 
plexity of this individual attack. 

To minimize the time complexity of an individual attack, the attack should
consider the probability of committing two types of errors: Type I error and
Type II error. A Type I error occurs when the hypothesis is rejected for a
key-dependent subset while in fact the actual key is in U, and the attack will fail
to find the actual key in this case. The probability of a Type I error is also called
the significance level, denoted as $\alpha$. A Type II error occurs when the test is passed
while in fact the actual key is not in U; in this case the attack will enter the
exhaustive search phase, but will not find the actual key. The probability of a Type II
error is denoted as $\beta$. With a fixed size of samples (denoted as N) and the significance
level $\alpha$, the criteria of rejecting the hypothesis are determined, and the probability
of Type II error $\beta$ is also fixed. For a fixed size of samples, it is impossible to
reduce both $\alpha$ and $\beta$ simultaneously. In order to reduce both $\alpha$ and $\beta$, the attack
has to use a larger size of samples, but the time and data complexity will increase.
Hence, an individual attack needs to balance the size of samples against
the probability of making a wrong judgement.

For a key-dependent subset (P, U), if the actual key is not in this subset,
the expected time complexity (measured by the number of encryptions) of the
individual attack on this subset is

$$W = N + \beta |U| \qquad (1)$$

If the actual key is in this subset, the expected time of the individual attack on
this subset is

$$R = N + (1 - \alpha)\frac{|U|}{2}$$
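
To make the trade-off between N, $\alpha$ and $\beta$ concrete, the following is a minimal sketch under a simplifying assumption that is not stated in the paper: each sample is modelled as one bit that equals 1 with probability `p_key` under the key-dependent distribution and with probability 0.5 under the random distribution, and the subset is accepted when at most `threshold` of the N bits are 1. The function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P[X <= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

def error_rates(n, threshold, p_key, p_random=0.5):
    """Error probabilities of the threshold test (assumes p_key < p_random)."""
    alpha = 1.0 - binom_cdf(threshold, n, p_key)  # Type I: right subset rejected
    beta = binom_cdf(threshold, n, p_random)      # Type II: wrong subset accepted
    return alpha, beta

def expected_times(n, alpha, beta, subset_size):
    """W from Equation (1) and R for the subset that holds the actual key."""
    w = n + beta * subset_size
    r = n + (1.0 - alpha) * subset_size / 2.0
    return w, r

alpha, beta = error_rates(n=64, threshold=20, p_key=0.2)
w, r = expected_times(64, alpha, beta, subset_size=2**20)
print(f"alpha={alpha:.4f} beta={beta:.6f} W={w:.1f} R={r:.1f}")
```

Raising the threshold lowers $\alpha$ and raises $\beta$ for the same N, which is the balance described in the text.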


Since the time complexity is dominated by attacking wrong key-dependent
subsets (there is only one key-dependent subset containing the actual key), the
attack only needs to minimize the time complexity of the individual attack for
each wrong key-dependent subset to minimize the total time complexity. Although
$\alpha$ does not appear in Equation (1), $\alpha$ affects the success probability of
the attack, so $\alpha$ should also be considered. We set an upper bound on $\alpha$ to
ensure that the success probability is above a fixed value, and then choose the
size of samples such that Equation (1) is minimized, in order to minimize the time
complexity of individual attacks.
In addition, it is entirely possible that some key-dependent distributions are
so close to the random distribution that the expected time for performing hypothesis
tests is longer than directly searching the subsets. For these key-dependent
subsets, the attack exhaustively searches the subset directly instead of using
the statistical hypothesis test method.

On the other hand, the time complexity of the key-dependent attack is also 
affected by the order of performing individual attacks on different key-dependent 
subsets. Because the expected time complexities of individual attacks are differ- 
ent, different sequences of performing individual attacks result in different total 
expected time complexity. Assume that a key-dependent attack performs individual
attacks on m key-dependent subsets in the order $(P_1, U_1), \ldots, (P_m, U_m)$.
Let $f_i$ denote the key fraction of $(P_i, U_i)$, let $R_i$ denote the expected time for
$(P_i, U_i)$ if the actual key is in $U_i$, and let $W_i$ denote the expected time if the
actual key is not in $U_i$. We have the following result:


Theorem 1. The expected time for the whole key-dependent attack is minimal
if the following condition is satisfied:

$$\frac{f_1}{W_1} \ge \frac{f_2}{W_2} \ge \cdots \ge \frac{f_m}{W_m}$$

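Proof. The expected time of the attack in the order $(P_1, U_1), \ldots, (P_m, U_m)$ is

$$\begin{aligned}
\Phi &= f_1[R_1 + \alpha(W_2 + W_3 + \cdots + W_m)] + f_2[W_1 + R_2 + \alpha(W_3 + \cdots + W_m)] \\
&\quad + f_3[W_1 + W_2 + R_3 + \alpha(W_4 + \cdots + W_m)] + \cdots + f_m(W_1 + W_2 + \cdots + W_{m-1} + R_m) \\
&= \sum_{i=1}^{m} f_i \Bigl(R_i + \sum_{j=1}^{i-1} W_j\Bigr) + \alpha \sum_{i=1}^{m} \Bigl(f_i \sum_{j=i+1}^{m} W_j\Bigr) \qquad (2)
\end{aligned}$$

If the attack is performed in the order $(P_{s_1}, U_{s_1}), (P_{s_2}, U_{s_2}), \ldots, (P_{s_m}, U_{s_m})$,
where $s_1, s_2, \ldots, s_m$ is a permutation of $1, 2, \ldots, m$, the expected time is

$$\Phi' = \sum_{i=1}^{m} f_{s_i} \Bigl(R_{s_i} + \sum_{j=1}^{i-1} W_{s_j}\Bigr) + \alpha \sum_{i=1}^{m} \Bigl(f_{s_i} \sum_{j=i+1}^{m} W_{s_j}\Bigr)$$

The term $f_i W_j + \alpha f_j W_i$ occurs in $\Phi$ if and only if $j < i$, and occurs in $\Phi'$ if and only if
$j' < i'$, where $s_{i'} = i$ and $s_{j'} = j$. Hence

$$\Phi - \Phi' = \sum_{j<i \text{ and } j'>i'} (f_i W_j + \alpha f_j W_i - f_j W_i - \alpha f_i W_j)$$

Since $\alpha < 1$ and $f_i W_j - f_j W_i \le 0$ for $j < i$, $\Phi - \Phi' \le 0$ for any permutation
$s_1, s_2, \ldots, s_m$. □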
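
The ordering rule of Theorem 1 can be checked by brute force on a small instance. The sketch below evaluates Equation (2) for every order of four subsets; the numbers for $f_i$, $W_i$, $R_i$ and $\alpha$ are made up for illustration and do not come from the paper.

```python
from itertools import permutations

def expected_time(order, f, R, W, alpha):
    """Equation (2) evaluated for subsets attacked in the given order."""
    total = 0.0
    for pos, i in enumerate(order):
        before = sum(W[j] for j in order[:pos])      # wrong subsets tried first
        after = sum(W[j] for j in order[pos + 1:])   # reached only after a Type I error
        total += f[i] * (R[i] + before + alpha * after)
    return total

# Illustrative values only (hypothetical key fractions and expected times).
f = [0.5, 0.3, 0.15, 0.05]
W = [40.0, 10.0, 30.0, 2.0]
R = [60.0, 25.0, 45.0, 5.0]
alpha = 0.01

# Order suggested by Theorem 1: decreasing f_i / W_i.
by_ratio = tuple(sorted(range(len(f)), key=lambda i: f[i] / W[i], reverse=True))
# Exhaustive search over all orders.
best = min(permutations(range(len(f))),
           key=lambda o: expected_time(o, f, R, W, alpha))

print(by_ratio, best)
```

With distinct ratios, the exhaustive search and the ratio ordering agree, as the proof predicts.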

In the following sections of this paper, we present a concrete key-dependent 
attack on the block cipher IDEA. 


The Key-Dependent Attack on Block Ciphers 




3 The IDEA Block Cipher 

In this section, we give a brief introduction of IDEA and notations used later in 
this paper. 

The IDEA block cipher encrypts a 64-bit plaintext with a 128-bit key by an
8.5-round encryption. The fifty-two 16-bit subkeys are generated from the 128-bit
key Z by the key-schedule algorithm. The subkeys are generated in the order
$Z_1^1, Z_2^1, \ldots, Z_6^1, Z_1^2, \ldots, Z_6^8, Z_1^9, \ldots, Z_4^9$. The key Z is partitioned into eight 16-bit words
which are used as the first eight subkeys. The key Z is then cyclically shifted to
the left by 25 bits, and the next eight subkeys are taken in the same way. This process
is repeated until all the subkeys are obtained. Table 2 gives the correspondence
between the subkeys and the bits of the key Z directly.
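
The key schedule described above is simple enough to recompute. The following sketch derives, for each subkey, which bits of Z it covers, assuming that bit 0 denotes the leftmost (most significant) bit of Z, which is the numbering that matches Table 2. The function names are illustrative.

```python
def idea_subkey_bits():
    """(first, last) bit index of Z used by each of the 52 subkeys, in generation order."""
    ranges = []
    for k in range(52):
        rotation = 25 * (k // 8)                  # Z has been rotated left k // 8 times
        first = (rotation + 16 * (k % 8)) % 128   # position inside the rotated key
        last = (first + 15) % 128
        ranges.append((first, last))
    return ranges

def by_round(ranges):
    """Rounds 1-8 use six subkeys each; the final half round uses four."""
    return [ranges[6 * r: 6 * r + 6] for r in range(9)]

table = by_round(idea_subkey_bits())
for rnd, row in enumerate(table, start=1):
    print(rnd, " ".join(f"{a}-{b}" for a, b in row))
```

Each printed row lists the bit ranges of $Z_1^i, \ldots, Z_6^i$ for round i and can be compared against Table 2.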

The block cipher partitions the 64-bit plaintext into four 16-bit words and
uses three different group operations on pairs of 16-bit words: exclusive OR,
denoted by $\oplus$; addition modulo $2^{16}$, denoted by $\boxplus$; and multiplication modulo
$2^{16} + 1$ (where 0 is treated as $2^{16}$), denoted by $\odot$.
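
The three group operations translate directly into code. This is a minimal sketch; the function names `xor`, `add` and `mul` are chosen here for illustration.

```python
MODULUS = 0x10001  # 2^16 + 1

def xor(a, b):
    """Bitwise exclusive OR of two 16-bit words."""
    return a ^ b

def add(a, b):
    """Addition modulo 2^16."""
    return (a + b) & 0xFFFF

def mul(a, b):
    """Multiplication modulo 2^16 + 1, where the word 0 stands for 2^16."""
    a = a or 0x10000
    b = b or 0x10000
    return ((a * b) % MODULUS) & 0xFFFF  # a result of 2^16 is written as 0

print(hex(mul(0, 0)), hex(mul(0, 2)), hex(add(0xFFFF, 1)))
```

Because $2^{16} \equiv -1 \pmod{2^{16}+1}$, multiplying by the word 0 acts as negation modulo $2^{16}+1$.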

As shown in Figure 1, each round of IDEA contains three layers: the KA layer, the MA
layer and the permutation layer. We denote the 64-bit input of round i by
$X^i = (X_1^i, X_2^i, X_3^i, X_4^i)$. In the KA layer, the first and the fourth words are modular
multiplied with $Z_1^i$ and $Z_4^i$ respectively. The second and the third words are modular
added with $Z_2^i$ and $Z_3^i$ respectively. The output of the KA layer is denoted by
$Y^i = (Y_1^i, Y_2^i, Y_3^i, Y_4^i)$.

In the MA layer, two intermediate values $p^i = Y_1^i \oplus Y_3^i$ and $q^i = Y_2^i \oplus Y_4^i$ are
computed first. These two values are processed to give $u^i$ and $t^i$:

$$u^i = (p^i \odot Z_5^i) \boxplus t^i$$

$$t^i = ((p^i \odot Z_5^i) \boxplus q^i) \odot Z_6^i$$

We denote by $s^i$ the intermediate value $p^i \odot Z_5^i$ for convenience. The output of the
MA layer is then permuted to give the output of this round,
$(Y_1^i \oplus t^i, Y_3^i \oplus t^i, Y_2^i \oplus u^i, Y_4^i \oplus u^i)$, which is also the input of round i + 1,
denoted by $(X_1^{i+1}, X_2^{i+1}, X_3^{i+1}, X_4^{i+1})$. Complete diffusion, which means that every bit of
$(X_1^{i+1}, X_2^{i+1}, X_3^{i+1}, X_4^{i+1})$ is affected by every bit of $(Y_1^i, Y_2^i, Y_3^i, Y_4^i)$, is obtained
in the MA layer.
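
The round description can be sketched as follows. The code assumes the standard IDEA round structure, in which the first two output words are masked with $t^i$ and the last two with $u^i$; the function name `idea_round` and the tuple layout are illustrative.

```python
MODULUS = 0x10001  # 2^16 + 1

def add(a, b):
    return (a + b) & 0xFFFF

def mul(a, b):
    # The word 0 stands for 2^16; a result of 2^16 is written back as 0.
    return (((a or 0x10000) * (b or 0x10000)) % MODULUS) & 0xFFFF

def idea_round(x, z):
    """One full round: x = (X1..X4), z = (Z1..Z6). Returns (next input, s)."""
    x1, x2, x3, x4 = x
    z1, z2, z3, z4, z5, z6 = z
    # KA layer
    y1, y2, y3, y4 = mul(x1, z1), add(x2, z2), add(x3, z3), mul(x4, z4)
    # MA layer
    p, q = y1 ^ y3, y2 ^ y4
    s = mul(p, z5)
    t = mul(add(s, q), z6)
    u = add(s, t)
    # Permutation layer
    return (y1 ^ t, y3 ^ t, y2 ^ u, y4 ^ u), s

out, s = idea_round((0x0000, 0x0001, 0x0002, 0x0003), (1, 2, 3, 4, 5, 6))
print([hex(w) for w in out], s)
```

Since $u^i = s^i \boxplus t^i$, the least significant bit of $t^i \oplus u^i$ equals that of $s^i$; this per-round identity is what the least-significant-bit relations used later in the paper are built from.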


Table 2. The Key-Schedule of IDEA

Round  Z_1^i    Z_2^i    Z_3^i    Z_4^i    Z_5^i    Z_6^i
1      0-15     16-31    32-47    48-63    64-79    80-95
2      96-111   112-127  25-40    41-56    57-72    73-88
3      89-104   105-120  121-8    9-24     50-65    66-81
4      82-97    98-113   114-1    2-17     18-33    34-49
5      75-90    91-106   107-122  123-10   11-26    27-42
6      43-58    59-74    100-115  116-3    4-19     20-35
7      36-51    52-67    68-83    84-99    125-12   13-28
8      29-44    45-60    61-76    77-92    93-108   109-124
9      22-37    38-53    54-69    70-85
Fig. 1. Round i of IDEA


In this paper, we will use $P = (P_1, P_2, P_3, P_4)$ and $P' = (P_1', P_2', P_3', P_4')$ to
denote a pair of plaintexts, where $P_i$ and $P_i'$ are 16-bit words. $C = (C_1, C_2, C_3, C_4)$
and $C' = (C_1', C_2', C_3', C_4')$ are their ciphertexts respectively. We also use the
symbol ' to distinguish the intermediate values corresponding to P' from those
corresponding to P. For example, $s^i$ is obtained from plaintext P, and P' will generate $s'^i$.
The notation $\Delta$ will denote the XOR difference; for instance, $\Delta s^i$ is equal to $s^i \oplus s'^i$.

4 The Key-Dependent Distribution of IDEA 

In this section, we describe the key-dependent distribution of the block cipher 
IDEA, which will be used in our attack later. The notation used is the same
as in [6].

The Biryukov-Demirci relation was first proposed by Biryukov [XB! and 
Demirci m, Many papers have discussed attacking on IDEA using this re- 
lation, such as | 1 1.61611 dll 611 8) . The relation can be written in following form 
( LSB denotes the least significant bit) 
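$$\begin{aligned}
LSB(C_2 \oplus C_3) = LSB(&P_2 \oplus P_3 \oplus Z_2^1 \oplus Z_3^1 \oplus s^1 \oplus Z_2^2 \oplus Z_3^2 \oplus s^2 \\
&\oplus Z_2^3 \oplus Z_3^3 \oplus s^3 \oplus Z_2^4 \oplus Z_3^4 \oplus s^4 \oplus Z_2^5 \oplus Z_3^5 \oplus s^5 \\
&\oplus Z_2^6 \oplus Z_3^6 \oplus s^6 \oplus Z_2^7 \oplus Z_3^7 \oplus s^7 \oplus Z_2^8 \oplus Z_3^8 \oplus s^8 \\
&\oplus Z_2^9 \oplus Z_3^9) \qquad (3)
\end{aligned}$$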


It is shown in |S] that, for two pairs of plaintext and ciphertext ( P , C) and 
(P',C r ), XOR their corresponding Biryukov-Demirci relation, we will obtain 
from Equation © 

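$$\begin{aligned}
LSB(C_2 \oplus C_3 \oplus C_2' \oplus C_3') = LSB(&P_2 \oplus P_3 \oplus P_2' \oplus P_3' \oplus \Delta s^1 \oplus \Delta s^2 \\
&\oplus \Delta s^3 \oplus \Delta s^4 \oplus \Delta s^5 \oplus \Delta s^6 \oplus \Delta s^7 \oplus \Delta s^8) \qquad (4)
\end{aligned}$$

We call Equation (4) the Biryukov-Demirci Equation.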

The following theorem shows that the probability distribution of $LSB(\Delta s^i)$
in the Biryukov-Demirci Equation is a key-dependent distribution.

Theorem 2. Consider round i of IDEA. If one pair of intermediate values $(p^i, p'^i)$
satisfies $\Delta p^i = 8000_x$, then the probability of $LSB(\Delta s^i) = LSB(8000_x \odot Z_5^i)$ is

$$Prob(LSB(\Delta s^i) = LSB(8000_x \odot Z_5^i)) = \frac{\#W}{2^{15}} \qquad (5)$$

where W is the set of all such 16-bit words w that $1 \le w \le 8000_x$ and that

$$(w * Z_5^i) + (8000_x * Z_5^i) < 2^{16} + 1$$

where * is defined as

$$a * b = \begin{cases} a \odot b & \text{if } a \odot b \ne 0 \\ 2^{16} & \text{if } a \odot b = 0 \end{cases}$$

Proof. Consider every intermediate pair $(p^i, p'^i)$ which satisfies $\Delta p^i = 8000_x$,
excluding $(0, 8000_x)$. We have $p'^i = p^i + 8000_x$ or $p^i = p'^i + 8000_x$. Without loss of
generality, assume $p'^i = p^i + 8000_x$, where $1 \le p^i < 8000_x$ and $8000_x < p'^i < 2^{16}$.

If we consider only the least significant bit, $LSB(s^i) = LSB(p^i * Z_5^i)$. The
following equations also hold:

$$\begin{aligned}
LSB(s'^i) &= LSB(p'^i \odot Z_5^i) \\
&= LSB(p'^i * Z_5^i) \\
&= LSB((p^i + 8000_x) * Z_5^i) \\
&= LSB(((p^i * Z_5^i) + (8000_x * Z_5^i)) \bmod (2^{16} + 1)) \qquad (6)
\end{aligned}$$

In the special case when $(p^i, p'^i)$ is $(0, 8000_x)$, let $p^i = 8000_x$ and $p'^i = 0$. Equations
(6) also hold, because $p'^i = 0$ is actually treated as $2^{16}$ for inputs of
$\odot$ and *.

If $(p^i * Z_5^i) + (8000_x * Z_5^i)$ is smaller than $2^{16} + 1$, then $LSB(s'^i) = LSB(s^i) \oplus
LSB(8000_x * Z_5^i)$ holds because of the equivalence of XOR and modular addition
for the least significant bit. Moreover, $LSB(\Delta s^i) = LSB(8000_x * Z_5^i)$ is satisfied,
which means $LSB(\Delta s^i) = LSB(8000_x \odot Z_5^i)$.

Otherwise, $LSB(s'^i)$ is equal to $LSB(s^i) \oplus LSB(8000_x * Z_5^i) \oplus 1$ because of
the carry. So $LSB(\Delta s^i)$ equals $LSB(8000_x \odot Z_5^i) \oplus 1$.

Therefore, we may conclude that $LSB(\Delta s^i) = LSB(8000_x \odot Z_5^i)$ if and only
if the pair $(p^i, p'^i)$ satisfies $(w * Z_5^i) + (8000_x * Z_5^i) < 2^{16} + 1$, where w is either $p^i$
or $p'^i$, whichever is between 1 and $8000_x$. There are at most $2^{15}$ such w, hence
Equation (5) holds. This completes the proof. □

Fig. 2. The key-dependent distribution of $Prob(LSB(\Delta s) = 1)$ on the value of $Z_5$
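
Theorem 2 can be checked exhaustively for a fixed subkey, since there are only $2^{15}$ pairs with $\Delta p^i = 8000_x$. The sketch below counts, for a given value of $Z_5^i$, the pairs for which $LSB(\Delta s^i) = LSB(8000_x \odot Z_5^i)$ and compares the count with $\#W$; the function names and the sample subkey are illustrative.

```python
MODULUS = 0x10001  # 2^16 + 1

def star(a, b):
    """The * operation: product mod 2^16 + 1 with 0 read as 2^16; result in 1..2^16."""
    return ((a or 0x10000) * (b or 0x10000)) % MODULUS

def lsb_mul(a, b):
    """LSB of the IDEA product; 2^16 is output as 0, and both are even."""
    return star(a, b) & 1

def empirical_count(z5):
    """Number of pairs {p, p xor 0x8000} with LSB(delta s) = LSB(0x8000 (.) z5)."""
    target = lsb_mul(0x8000, z5)
    hits = 0
    for p in range(0x8000):  # one representative per unordered pair
        if lsb_mul(p, z5) ^ lsb_mul(p ^ 0x8000, z5) == target:
            hits += 1
    return hits

def predicted_count(z5):
    """Size of the set W from Theorem 2."""
    b = star(0x8000, z5)
    return sum(1 for w in range(1, 0x8001) if star(w, z5) + b < MODULUS)

z5 = 0x1234
print(empirical_count(z5) / 2**15, predicted_count(z5) / 2**15)
```

Both functions run over $2^{15}$ values, so sweeping all $2^{16}$ subkeys to reproduce a plot like Figure 2 is feasible but takes noticeably longer.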

Remark 1. Figure 2 plots the relation between the subkey $Z_5^i$ and the probability
of $LSB(\Delta s^i) = 1$. As shown in Figure 2, for most $Z_5^i$, the probability of
$LSB(\Delta s^i) = 1$ is different from the random distribution. Hence, it is possible to
perform a key-dependent attack on IDEA using this key-dependent distribution.

For most $Z_5^i$, there are generally four cases for the probability of $LSB(\Delta s^i) = 1$
as $Z_5^i$ grows from 0 to $2^{16} - 1$, which can be roughly approximated as follows:

last two bits of Z\ = 00 
last two bits of Z\ = 01 
last two bits of Z\ = 10 
last two bits of Z\ = 11 


f# 

Prob(LSB(As i ) = i) « J ^ ^ 
[o.5+ Jfr 


( 7 ) 
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From Equation 0, following approximation also holds for most Z\ 


n{Prob{LSB{As i ) = 0 ),Prob(LSB(As i ) = 1)} 


„ / 

^ \o.5 — 0 ?, 


LSB(Zi)-- 

LSB(Zi) 


- 1 

(8) 

Calculation shows that, for only 219 out of all $2^{16}$ possible $Z_5^i$, the difference
between the approximation (Equation (7) or (8)) and the accurate probability is
larger than 0.01.

Equation Q indicates that we can approximate left hand side of Equation ® 
by fixing several most significant bits and the least significant bit. In following 
sections, we will show that we only need to distinguish the approximate probabil- 
ity distribution from random distribution. Hence, for most Z\, this approxima- 
tion is close enough to the accurate value. For Z 3 that can not be approximated 
in this way, we use other methods to deal with this situation. 


5 The Key-Dependent Attack on IDEA 

In this section, we will present two key-dependent attacks on reduced-round
IDEA. In Section 5.1 we will give a basic attack on the 5.5-round variant of
IDEA and then extend it to the 6-round variant in Section 5.2. We also give two
key-dependent attacks on 5-round IDEA starting from the first round in Section 5.3.

5.1 The Attack on 5.5-Round Variant of IDEA 

We first present one key-dependent attack on the 5.5-round variant of IDEA.
The attack starts from the third round and ends before the MA layer of the
eighth round. The main idea of this attack is to perform a key-dependent attack
based on the key-dependent distribution of $\Delta s^4$ described in Theorem 2.

Consider the 5.5-round variant of IDEA starting from the third round. The
Biryukov-Demirci Equation can be rewritten as

$$\mathrm{LSB}(\Delta s^4) = \mathrm{LSB}(P_2 \oplus P_3 \oplus P_2' \oplus P_3' \oplus C_2 \oplus C_3 \oplus C_2' \oplus C_3' \oplus \Delta s^3 \oplus \Delta s^5 \oplus \Delta s^6 \oplus \Delta s^7) \qquad (9)$$

where $P$ and $P'$ are equivalent to $X^3$ and $X'^3$, and $C$ and $C'$ are equivalent to $Y^8$
and $Y'^8$ in this variant of IDEA.

We first construct a pair of plaintexts satisfying the specific constraint
$\Delta p^4 = 8000_x$. The construction is based on the following lemma.

Lemma 1. For any $a$, if two 16-bit words $x$ and $x'$ have the same least
significant 15 bits, then

• $x \oplus a$ and $x' \oplus a$ have the same least significant 15 bits,

• $x \boxplus a$ and $x' \boxplus a$ have the same least significant 15 bits.
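
Lemma 1 is easy to confirm numerically. The following sketch, with hypothetical helper names, checks both statements for a 16-bit word $x$ and the only other word sharing its least significant 15 bits, $x \oplus 8000_x$; it assumes that $\boxplus$ is addition modulo $2^{16}$.

```python
import random

MASK15 = 0x7FFF  # selects the least significant 15 bits


def same_low15(u: int, v: int) -> bool:
    """True if u and v agree on their least significant 15 bits."""
    return (u & MASK15) == (v & MASK15)


def lemma1_holds(x: int, a: int) -> bool:
    """Check both claims of the lemma for x and x' = x xor 8000x."""
    x_prime = x ^ 0x8000
    xor_ok = same_low15(x ^ a, x_prime ^ a)
    add_ok = same_low15((x + a) & 0xFFFF, (x_prime + a) & 0xFFFF)
    return xor_ok and add_ok


if __name__ == "__main__":
    rng = random.Random(2009)
    samples = [(rng.randrange(1 << 16), rng.randrange(1 << 16)) for _ in range(10000)]
    print(all(lemma1_holds(x, a) for x, a in samples))
```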

Based on Lemma 1, the following proposition can be obtained.

Proposition 1. If a pair of intermediate values $Y^3$ and $Y'^3$ satisfy the following
conditions:

a. $\Delta Y_1^3 = \Delta Y_3^3 = 0$

b. $\Delta Y_2^3 = 8000_x$

c. $Y_2^3 \oplus Y_4^3 = Y_2'^3 \oplus Y_4'^3$

then $\Delta s^3 = 0$ and the probability of $\mathrm{LSB}(\Delta s^4) = 0$ is determined by the
key-dependent distribution given in Section 4.

Proof. From Condition (a), $\Delta Y_1^3 = \Delta Y_3^3 = 0$, so $p^3$ is equal to $p'^3$. Then
$\Delta s^3 = 0$ follows directly.

From Condition (c), $q^3$ is equal to $q'^3$. If $p^3$ and $q^3$ are fixed, $u^3$ and $t^3$ are
also fixed with respect to any $Z_5^3$ and $Z_6^3$. Since $Y_1^3 = Y_1'^3$ and both values are
XORed with the same output of the MA layer, it follows that $X_1^4 = X_1'^4$.
Note that $Y_1^4$ and $Y_1'^4$ are the results of modular-multiplying $X_1^4$ and $X_1'^4$ with
the same $Z_1^4$, hence $Y_1^4$ is equal to $Y_1'^4$.

On the other hand, $\Delta Y_2^3 = 8000_x$ means that the least significant 15 bits of
$Y_2^3$ are equal to those of $Y_2'^3$ and the most significant bit of $Y_2^3$ and that of $Y_2'^3$ are
different. Because $u^3$ is fixed, by Lemma 1 the least significant 15 bits of $X_3^4$ are
equal to those of $X_3'^4$. Then $\Delta X_3^4$ is equal to $8000_x$ and $\Delta Y_3^4 = 8000_x$ is obtained
by modular addition with the same $Z_3^4$. From $\Delta Y_1^4 = 0$ and $\Delta Y_3^4 = 8000_x$, $\Delta p^4$
is $8000_x$. The conclusion then follows from the key-dependent distribution of Section 4. □

In our attack, we use the plaintext pairs satisfying Proposition 1. We obtain
Condition (a) by letting $\Delta P_1 = \Delta P_3 = 0$. $P_2$ and $P_2'$ are fixed
to have the same least significant 15 bits, and hence, by Lemma 1, $\Delta Y_2^3 = 8000_x$. In order to
fulfill Condition (c), we have to guess $Z_4^3$ and then, according to this guess,
choose $P_4$ and $P_4'$ which satisfy $\Delta Y_4^3 = 8000_x$.

By Proposition 1, $\Delta s^3$ is equal to zero. In order to get the right-hand side
of Equation (9), we still need to get $\Delta s^5$, $\Delta s^6$, $\Delta s^7$. We need to guess $Z_5^5$, $Z_1^6$,
$Z_2^6$, $Z_5^6$, $Z_6^6$, $Z_1^7$, $Z_2^7$, $Z_3^7$, $Z_4^7$, $Z_5^7$, $Z_6^7$, $Z_1^8$, $Z_2^8$, $Z_3^8$, $Z_4^8$. As shown in [?], one can
partially decrypt one pair of encryptions using these 15 subkeys to calculate the
values of $\Delta s^5$, $\Delta s^6$, $\Delta s^7$. These 15 subkeys only take key bits 125-99 and also
cover the subkey $Z_4^3$. Hence, for each guess of these 103 key bits, we can calculate the
value of $\Delta s^4$ from a special pair of encryptions.
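
The bit count can be reproduced from the IDEA key schedule, in which the 52 subkeys are read off eight at a time and the 128-bit key is rotated left by 25 bits between batches. The sketch below is illustrative: it assumes key bits are numbered from 0 at the most significant end, and the list of 15 subkeys is the one implied by the round structure (an assumption wherever the original list is not legible). The helper name `subkey_bits` is hypothetical.

```python
def subkey_bits(round_no: int, index: int) -> set:
    """Key-bit positions (0 = most significant) used by subkey Z_index^round_no."""
    m = 6 * (round_no - 1) + (index - 1)   # position among the 52 subkeys
    start = 25 * (m // 8) + 16 * (m % 8)   # each batch of 8 follows a 25-bit rotation
    return {(start + j) % 128 for j in range(16)}


# (round, index) pairs of the 15 guessed subkeys, assumed from the round structure.
GUESSED = [(5, 5), (6, 1), (6, 2), (6, 5), (6, 6),
           (7, 1), (7, 2), (7, 3), (7, 4), (7, 5), (7, 6),
           (8, 1), (8, 2), (8, 3), (8, 4)]

covered = set().union(*(subkey_bits(r, i) for r, i in GUESSED))

if __name__ == "__main__":
    print(len(covered))                      # number of distinct key bits guessed
    print(subkey_bits(3, 4) <= covered)      # is Z_4^3 covered?
    print(subkey_bits(4, 5) <= covered)      # is Z_5^4 covered?
```

Under these assumptions the covered positions are exactly the 103 bits 125, 126, 127, 0, ..., 99, and both $Z_4^3$ and $Z_5^4$ lie inside them.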

We also note that these 103 bits cover the key $Z_5^4$, which determines the
key-dependent distribution of $\Delta s^4$ (Section 4). Therefore, we can
perform the key-dependent attack on the 5.5-round variant of IDEA. As described
in Section 2, the key space can be divided into $2^{103}$ key-dependent subsets by
the 103 key bits, each of which contains $2^{25}$ keys.

For a key-dependent subset (P, U), let $p$ denote the probability of
$\mathrm{LSB}(\Delta s^4) = \mathrm{LSB}(8000_x \odot Z_5^4)$. For simplicity, in the following analysis, we assume
that $p < 0.5$; the case $p > 0.5$ is similar. Assume the sample consists of $n$ pairs of
encryptions that satisfy the specific constraint on this key-dependent subset, and
$t$ of them satisfy $\mathrm{LSB}(\Delta s^4) = \mathrm{LSB}(8000_x \odot Z_5^4)$. The criterion for not rejecting
the hypothesis is that $t$ is smaller than or equal to a fixed value $k$. The probability of a
Type I error is

$$\alpha = \sum_{i=k+1}^{n} \binom{n}{i} p^i (1-p)^{n-i}.$$

The probability of a Type II error, where a wrong subset behaves randomly, is

$$\beta = \sum_{i=0}^{k} \binom{n}{i} 0.5^i (1-0.5)^{n-i}.$$


If (P, U) is a wrong key-dependent subset, the expected time complexity of
checking this subset is

$$W = 2n + 2^{25}\beta \qquad (10)$$

As shown in Section 2, the attack sets $\alpha$ smaller than or equal to 0.01 to ensure
that the probability of false rejection will not exceed 0.01. Under this precondition,
the attack chooses $n$ and $k$ (and hence $\beta$) so that $\alpha < 0.01$ and Equation (10)
is minimized, which minimizes the time complexity on each key-dependent subset (P, U). By
Section 2, this choice also minimizes the total expected time complexity.
Because this choice depends only on the key $Z_5^4$, we only need to compute $n$ and
$k$ for $2^{16}$ different values.

For example, for a key-dependent subset (P, U) with $Z_5^4 = 8001_x$, $p$ is about
0.666687. The attack checks every possible $n$ and $k$ to find the minimized
expected time complexity of the individual attack for this subset. As shown in
Section 2, the expected time complexity for each subset is upper bounded by
exhaustive search on the subset, which is $2^{25}$ in this attack. Hence, the attack only
checks the $n$ and $k$ smaller than $2^{25}$. The expected time is minimized with
precondition $\alpha < 0.01$ when $n = 425$ and $k = 164$. In this case, $\alpha = 0.009970$,
$\beta = 0.000001$ and $W = 899.094678$.

Fig. 3. The number of encryptions used and expected time complexity for individual
attacks
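
The figures of this example can be recomputed from the binomial tails. The sketch below is only an illustration under two assumptions: a wrong subset yields a uniformly random bit, so the Type II error is a binomial tail with parameter 0.5, and for $p > 0.5$ the test is applied to the complementary event, so the binomial parameter used for the Type I error is $1 - p$. The function names are hypothetical.

```python
from math import comb


def type_i_error(n: int, k: int, p: float) -> float:
    """Probability that more than k of n samples hit, for hit probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))


def type_ii_error(n: int, k: int) -> float:
    """Probability that at most k of n uniformly random bits hit."""
    return sum(comb(n, i) for i in range(k + 1)) / 2**n


def wrong_subset_cost(n: int, k: int, remaining_bits: int = 25) -> float:
    """Expected cost 2n + 2^remaining_bits * beta of checking a wrong subset."""
    return 2 * n + 2**remaining_bits * type_ii_error(n, k)


# Example values from the text; 1 - p is used because p > 0.5 (assumption).
n, k, p = 425, 164, 1 - 0.666687
alpha = type_i_error(n, k, p)
beta = type_ii_error(n, k)
W = wrong_subset_cost(n, k)

if __name__ == "__main__":
    print(alpha, beta, W)
```

A search over all admissible $(n, k)$ would call these functions in a loop and keep the pair with the smallest cost among those with a Type I error below 0.01.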

Since all the key-dependent subsets have the same key fraction, the order
of performing individual attacks with minimal expected time complexity
becomes the ascending order of $W$ for all key-dependent subsets, as shown in Section 2.
Figure 3 plots the number of encryptions used and the expected time complexity for
all the individual attacks.

The total expected time complexity of the attack, following the analysis of Section 2,
becomes

$$\begin{aligned}
\Phi &= \frac{1}{2^{103}} \Big( \sum_{i} R_i + \sum_{i} \sum_{j=1}^{i-1} W_j + 0.01 \sum_{i} \sum_{j=i+1}^{2^{103}} W_j \Big) \\
&\le \frac{1}{2^{103}} \Big( 2^{103} \cdot 2^{26} + \sum_{i} (2^{103} - i + 0.01\, i)\, W_i \Big) \\
&\approx 2^{112.1}
\end{aligned}$$


with 99% success probability if the attack chooses $n$ and $k$ for each key-dependent
subset and determines the order of performing individual attacks as shown above.
The number of pairs needed in one test is about $2^{19}$ in the worst case. The attack
uses a set of $2^{21}$ plaintexts, which can provide $2^{20}$ plaintext pairs satisfying the
conditions in Proposition 1 for each key-dependent subset.

The attack is summarized as follows:

1. For every possible $Z_5^4$, calculate the corresponding number of plaintext pairs
needed $n$ and the criterion for not rejecting the hypothesis $k$.

2. Suppose $S$ is an empty set. Randomly enumerate a 16-bit word $s$, insert $s$
and $s \oplus 8000_x$ into the set $S$. Repeat this enumeration until set $S$ contains
$2^5$ different words. Ask for the encryption of all the plaintexts of the form
$(A, B, C, D)$, where $A$ and $C$ are fixed to two arbitrary constants, $B$ takes
all the values in $S$ and $D$ takes all the 16-bit possible values.

3. Enumerate the key-dependent subsets in ascending order of $W$:

(a) Randomly choose a set of plaintext pairs with cardinality $n$ from the
known encryptions. The plaintext pairs must satisfy the requirements of
Proposition 1.

(b) Partially decrypt all the selected encryption pairs and count the
occurrences of $\mathrm{LSB}(\Delta s^4) = 1$.

(c) Test the hypothesis. If the hypothesis is not rejected, perform exhaustive
search for the remaining 25 key bits.
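
Step 2 of the summary can be sketched as follows. This is a minimal illustration with hypothetical function names; the constants chosen for $A$ and $C$ are arbitrary, and the encryption oracle itself is not modelled.

```python
import random


def build_b_values(count: int = 32, seed=None) -> set:
    """Collect `count` 16-bit words closed under XOR with 8000x."""
    rng = random.Random(seed)
    s_set = set()
    while len(s_set) < count:
        s = rng.randrange(1 << 16)
        s_set.update((s, s ^ 0x8000))  # s and its partner always enter together
    return s_set


def chosen_plaintexts(a: int, c: int, b_values: set):
    """Yield every plaintext (A, B, C, D) with B in the set and D ranging over 16 bits."""
    for b in sorted(b_values):
        for d in range(1 << 16):
            yield (a, b, c, d)


if __name__ == "__main__":
    b_values = build_b_values(seed=2009)
    total = sum(1 for _ in chosen_plaintexts(0x0123, 0x4567, b_values))
    print(len(b_values), total)
```

With $2^5$ values of $B$ and $2^{16}$ values of $D$ the set contains $2^{21}$ plaintexts, in line with the data complexity stated above.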

5.2 The Attack on 6-Round Variant of IDEA

We now extend the 5.5-round attack to an attack on the 6-round variant of
IDEA starting before the MA layer of the second round. The data complexity
of the attack is $2^{49}$ and the time complexity is $2^{112.1}$.

As shown in [?], $Z_5^2$ and $Z_6^2$ are included in the 103 key bits in the
5.5-round attack. Hence, we can add this half round to the 5.5-round attack without
enlarging the time complexity.

It is more difficult to construct right plaintext pairs satisfying Proposition 1.
Consider a pair of intermediate values $X^3$ and $X'^3$ before the third round which
satisfy Proposition 1. If we partially decrypt $X^3$ and $X'^3$ using any possible $Z_5^2$
and $Z_6^2$, the only fact we know is that all the results have the same XOR of
the first and third words. The attack hence selects all the plaintexts $P$ where
the least significant 15 bits of $P_1 \oplus P_3$ are fixed to an arbitrary 15-bit constant.
The total number of selected plaintexts is $2^{49}$. It is possible to provide $2^{48}$
plaintext pairs satisfying the conditions in Proposition 1 in the test for any $Z_5^2$, $Z_6^2$
and $Z_4^3$. This number is sufficient in any situation.

5.3 Two Key-Dependent Attacks on 5-Round IDEA Starting from 
the First Round 

We apply the key-dependent attack to the 5-round IDEA starting from the first
round. The Biryukov-Demirci Equation is reduced to

$$\mathrm{LSB}(\Delta s^2) = \mathrm{LSB}(P_2 \oplus P_3 \oplus P_2' \oplus P_3' \oplus C_2 \oplus C_3 \oplus C_2' \oplus C_3' \oplus \Delta s^1 \oplus \Delta s^3 \oplus \Delta s^4 \oplus \Delta s^5) \qquad (11)$$

We choose the plaintext pairs to satisfy Proposition 1 before the first round
by guessing $Z_4^1$, and then $\Delta s^1$ is equal to 0 as shown in Section 5.1. In order
to determine the right-hand side of Equation (11), we need to know $Z_5^3$, $Z_1^4$,
$Z_2^4$, $Z_5^4$, $Z_6^4$, $Z_1^5$, $Z_2^5$, $Z_3^5$, $Z_4^5$, $Z_5^5$, $Z_6^5$. These 12 subkeys (including $Z_4^1$)
take the bits 75-65 from key $Z$. These 119 bits only cover the most significant nine bits of $Z_5^2$, which
determines the probability distribution of $\mathrm{LSB}(\Delta s^2)$. It is not necessary to guess
the complete subkey $Z_5^2$. The attack additionally guesses the least significant
bit of $Z_5^2$ (the 72nd bit of $Z$), and estimates the probability of $\mathrm{LSB}(\Delta s^2) = 1$
by the approximation of Section 4 (Equation (8)) instead. Hence, the attack divides the key space
into $2^{120}$ key-dependent subsets by the 120 key bits, and performs the individual attacks on
each key-dependent subset. The attack uses the statistical hypothesis test method
to determine which subset the actual key is in. For the subkeys $Z_5^2$ for which
$\mathrm{Prob}(\mathrm{LSB}(\Delta s^2) = 1)$ cannot be approximated in this way (see Section 4),
the attack exhaustively searches the remaining key bits.

In this attack, it is possible that the expected time of an individual attack
is larger than that of direct exhaustive search for some key-dependent subsets,
which means

$$2n + \beta \cdot 2^8 > 2^8$$

Under this condition, the attack also uses exhaustive key search to determine the
remaining eight key bits, to make sure the time needed does not exceed that of exhaustive search.
This attack also chooses $\alpha < 0.01$ to ensure that the attack succeeds with 99%
probability. In this case, the total expected time complexity is $2^{125.5}$ encryptions.

Our experiment shows that the attack needs at most 75 pairs of encryptions for
one test. We ask for $2^{17}$ encryptions which can provide $2^{16}$ pairs of encryptions,
which is sufficient for the test. This data complexity ($2^{17}$) is the least out of all
the known attacks on the 5-round IDEA starting from the first round.

In the second attack, we try to obtain the plaintext pairs satisfying Proposition 1
before the second round. In order to determine $\mathrm{LSB}(\Delta s^3)$, we need to know
the least significant bits of $\Delta s^1$, $\Delta s^2$, $\Delta s^4$ and $\Delta s^5$. Hence, we need
to know 13 subkeys, which only cover 107 bits of the key $Z$ (bits 0-106). For every
guessed 107 key bits, we use a similar technique as before. The expected time complexity
is $2^{115.3}$, which is the least time complexity out of all the known attacks on the
5-round IDEA starting from the first round.

Because it is not possible to predict the plaintext pairs which produce the
intermediate pairs satisfying Proposition 1 before the second round, the
encryptions of all the $2^{64}$ plaintexts are required.

6 Conclusions 

In this paper, we formalized a scheme of identifying the actual key using the 
key-dependent distribution, called key-dependent attack. How to minimize the 
time complexity of the key-dependent attack was also discussed. With the key- 
dependent attack, we could improve known cryptanalysis results and obtain 
more powerful attacks. We presented two key-dependent attacks on IDEA. Our
attacks on the 5.5-round and 6-round variants of IDEA have the least time and data
complexities compared with the previous attacks.

We have only made a tentative exploration of the key-dependent distribution.
How to make full use of the key-dependent distribution, especially how to use the
key-dependent distribution to improve existing attacks, is worth further study.

The attack on IDEA makes use of the relation between XOR, modular addition
and modular multiplication. We believe that the operations XOR and
modular multiplication have more properties that can be explored further [?].
Similar relations among other operations are also worth researching. The way
of making full use of the Biryukov-Demirci Equation to improve attacks on IDEA
is also interesting.


Abstract. The security of cascade blockcipher encryption is an important
and well-studied problem in theoretical cryptography with practical
implications. It is well-known that double encryption improves the secu- 
rity only marginally, leaving triple encryption as the shortest reasonable 
cascade. In a recent paper, Bellare and Rogaway showed that in the 
ideal cipher model, triple encryption is significantly more secure than 
single and double encryption, stating the security of longer cascades as 
an open question. 

In this paper, we propose a new lemma on the indistinguishability of 
systems extending Maurer’s theory of random systems. In addition to 
being of independent interest, it allows us to compactly rephrase Bellare 
and Rogaway’s proof strategy in this framework, thus making the argu- 
ment more abstract and hence easy to follow. As a result, this allows 
us to address the security of longer cascades. Our result implies that 
for blockciphers with smaller key space than message space (e.g. DES), 
longer cascades improve the security of the encryption up to a certain 
limit. This partially answers the open question mentioned above. 

Keywords: cascade encryption, ideal cipher model, random system, 
indistinguishability. 


1 Introduction 

The cascade encryption is a simple and practical construction used to enlarge the
key space of a blockcipher without the need to switch to a new algorithm. Instead
of applying the blockcipher only once, it is applied $l$ times with $l$ independently
chosen keys. A prominent and widely used example of this construction is the
Triple DES encryption [?].

Many results investigating the power of the cascade construction have been
published. It is well-known that double encryption does not significantly improve
the security over single encryption due to the meet-in-the-middle attack [?]. The
marginal security gain achieved by double encryption was described in [?]. Even
and Goldreich [?] show that a cascade of ciphers is at least as strong as the
strongest of the ciphers against attacks that are restricted to operating on full
blocks. In contrast, Maurer and Massey [?] show that for the most general
attack model, where it is for example possible that an attacker might obtain
only half the ciphertext block for a chosen message block, the cascade is only at
least as strong as the first cipher of the cascade.

In a recent paper [3], Bellare and Rogaway have claimed a lower bound on the
security of triple encryption in the ideal cipher model. Their bound implies that
for a blockcipher with key length $k$ and block length $n$, triple encryption is
indistinguishable from a random permutation as long as the distinguisher is allowed
to make not more than roughly $2^{k + \frac{1}{2}\min\{n,k\}}$ queries. This bound is significantly
higher than the known upper bound on the security of single and double encryption,
proving that triple encryption is the shortest cascade that provides a
reasonable security improvement over single encryption. Since a longer cascade is at
least as secure as a shorter one, their bound applies also to longer cascades. They
formulate as an interesting open problem to determine whether the security
improves with the length of the cascade also for lengths $l > 3$. However, the proof in
[3] contains a few bugs, which we describe in the appendix of this paper. The first
part of our contribution is to fix these errors and to reestablish the lower bound
on the security of triple encryption up to a constant factor.

Second, we have rephrased the proof into the random systems framework
introduced in [?]. Our goal here is to simplify the proof and express it on the most
abstract level possible, thus making the main line of reasoning easy to follow and
clearly separated from the two technical arguments required. To achieve this, we
extend the random systems framework by a new lemma. This lemma is a
generalization of Lemma 7 from [?] and hence also of its special case for the
game-playing scenario, the Fundamental lemma of game-playing. This was introduced
in [3] and subsequently used as an important tool in game-playing proofs (see
for example [?]). We illustrate the use of this new lemma in our proof of the
security of cascade encryption. Apart from the simplification, this also gives us
an improvement of the result by a constant factor.

Finally, our reformulation makes it natural to consider also the security of
longer cascades. The lower bound we prove improves with the length of the
cascade $l$ for all blockciphers where $k < n$ and for moderate values of $l$. With
increasing cascade length, the bound approaches very roughly the value $2^{k + \min\{n/2,\, k\}}$
(the exact formula can be found in Theorem 1). The condition $k < n$ is satisfied
for example for the DES blockcipher, where the length of the key is 56 bits and
the length of one block is 64 bits. For these parameters, the result from [3] that
we reestablish proves that triple encryption is secure up to $2^{78}$ queries, but
our result shows that a cascade of length 5 is secure up to $2^{83}$ queries. The larger
the difference $n - k$, the more a longer cascade can help. This partially answers
the open question from [3].

2 Preliminaries 

2.1 Basic Notation 

Throughout the paper, we denote sets by calligraphic letters (e.g. $\mathcal{S}$). For a
finite set $\mathcal{S}$, we denote by $|\mathcal{S}|$ the number of its elements. A $k$-tuple is denoted as
$u^k = (u_1, \ldots, u_k)$, and the set of all $k$-tuples of elements of $\mathcal{U}$ is denoted as $\mathcal{U}^k$.
The composition of mappings is interpreted from left to right, i.e., $f \circ g$ denotes
the mapping $g(f(\cdot))$. The set of all permutations of $\{0,1\}^n$ is denoted by $\mathrm{Perm}(n)$
and $\mathrm{id}$ represents the identity mapping, if the domain is implicitly given. The
notation $x^{\underline{n}}$ represents the falling factorial power, i.e., $x^{\underline{n}} = x(x-1)\cdots(x-n+1)$.
The symbol $p_{\mathrm{coll}}(n,k)$ denotes the probability that $k$ independent random
variables with uniform distribution over a set of size $n$ contain a collision, i.e., that
they are not all distinct. It is well-known that $p_{\mathrm{coll}}(n,k) < k^2/2n$. By $\mathrm{CS}(\cdot)$
we shall denote the set of all cyclic shifts of a given tuple, in other words,
$\mathrm{CS}(\pi_1, \pi_2, \ldots, \pi_r) = \{(\pi_1, \pi_2, \ldots, \pi_r), (\pi_2, \pi_3, \ldots, \pi_r, \pi_1), \ldots, (\pi_r, \pi_1, \ldots, \pi_{r-1})\}$.
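
The three pieces of notation above can be made concrete with a few lines of code. This is a small illustrative sketch; the function names are hypothetical and the collision probability is computed exactly as one minus the probability that all $k$ samples are distinct.

```python
from math import prod


def falling_factorial(x: int, n: int) -> int:
    """x(x-1)...(x-n+1), with the empty product equal to 1."""
    return prod(x - i for i in range(n))


def p_coll(n: int, k: int) -> float:
    """Probability that k uniform samples from a set of size n are not all distinct."""
    return 1 - falling_factorial(n, k) / n**k


def cyclic_shifts(t: tuple) -> set:
    """The set CS(t) of all cyclic shifts of the tuple t."""
    return {t[i:] + t[:i] for i in range(len(t))}


if __name__ == "__main__":
    print(falling_factorial(5, 3))
    print(p_coll(365, 23), 23**2 / (2 * 365))   # exact value against the k^2/2n bound
    print(cyclic_shifts(("a", "b", "c")))
```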

We usually denote random variables and concrete values they can take on by
capital and small letters, respectively. For events $A$ and $B$ and random variables
$U$ and $V$ with ranges $\mathcal{U}$ and $\mathcal{V}$, respectively, we denote by $P_{UA|VB}$ the
corresponding conditional probability distribution, seen as a function $\mathcal{U} \times \mathcal{V} \to [0,1]$.
Here the value $P_{UA|VB}(u,v)$ is well-defined for all $u \in \mathcal{U}$ and $v \in \mathcal{V}$ such that
$P_{VB}(v) > 0$ and undefined otherwise. Two probability distributions $P_U$ and
$P_{U'}$ on the same set $\mathcal{U}$ are equal, denoted $P_U = P_{U'}$, if $P_U(u) = P_{U'}(u)$ for
all $u \in \mathcal{U}$. Conditional probability distributions are equal if the equality holds
for all arguments for which both of them are defined. To emphasize the random
experiment $\mathcal{E}$ in consideration, we sometimes write it in the superscript,
e.g. $P^{\mathcal{E}}_{U|V}(u,v)$. The expected value of the random variable $X$ is denoted by
$\mathrm{E}[X] = \sum_x (x \cdot \mathrm{P}[X = x])$. The complement of an event $A$ is denoted by $\overline{A}$.

2.2 Random Systems 

In this subsection, we present the basic notions of the random systems
framework, as introduced in [?], along with some new extensions of the framework.
The input-output behavior of any discrete system can be described by a random
system in the spirit of the following definition.

Definition 1. An $(\mathcal{X}, \mathcal{Y})$-random system $\mathbf{F}$ is a (generally infinite) sequence of
conditional probability distributions $P^{\mathbf{F}}_{Y_i|X^iY^{i-1}}$ for all $i \ge 1$.

The behavior of the random system is specified by the sequence of conditional
probabilities $P^{\mathbf{F}}_{Y_i|X^iY^{i-1}}(y_i, x^i, y^{i-1})$ (for $i \ge 1$) of obtaining the output $y_i \in \mathcal{Y}$
on query $x_i \in \mathcal{X}$ given the previous $i - 1$ queries $x^{i-1} = (x_1, \ldots, x_{i-1}) \in \mathcal{X}^{i-1}$
and their corresponding outputs $y^{i-1} = (y_1, \ldots, y_{i-1}) \in \mathcal{Y}^{i-1}$. A random system
can also be defined by a sequence of conditional probability distributions $P^{\mathbf{F}}_{Y^i|X^i}$
for $i \ge 1$. This description is often convenient, but is not minimal.

We shall use boldface letters (e.g. F) to denote both a discrete system and 
a random system corresponding to it. This should cause no confusion. We em- 
phasize that although the results of this paper are stated for random systems, 
they hold for arbitrary systems, since the only property of a system that is rel- 
evant here is its input-output behavior. It is reasonable to consider two discrete 
systems equivalent if their input-output behaviors are the same, even if their 
internal structure differs. 

Definition 2. Two systems $\mathbf{F}$ and $\mathbf{G}$ are equivalent, denoted $\mathbf{F} \equiv \mathbf{G}$, if they
correspond to the same random system, i.e., if $P^{\mathbf{F}}_{Y_i|X^iY^{i-1}} = P^{\mathbf{G}}_{Y_i|X^iY^{i-1}}$ for
all $i \ge 1$.

We shall usually define a system (and hence also the corresponding random
system) by a description of its internal working, as long as the transition to the
probability distributions is straightforward. Examples of random systems that
we consider in the following are the uniform random permutation $\mathbf{P} : \{0,1\}^n \to \{0,1\}^n$,
which realizes a function randomly chosen from $\mathrm{Perm}(n)$; and the ideal
blockcipher $\mathbf{E} : \{0,1\}^k \times \{0,1\}^n \to \{0,1\}^n$, which realizes an independent
uniformly random permutation for each key $K \in \{0,1\}^k$. In this paper we assume
that both $\mathbf{P}$ and $\mathbf{E}$ can be queried in both directions.

We can define a distinguisher $\mathbf{D}$ for an $(\mathcal{X}, \mathcal{Y})$-random system as a
$(\mathcal{Y}, \mathcal{X})$-random system which is one query ahead, i.e., it is defined by the conditional
probability distributions $P^{\mathbf{D}}_{X_i|X^{i-1}Y^{i-1}}$ for all $i \ge 1$. In particular, the first query
of $\mathbf{D}$ is determined by $P^{\mathbf{D}}_{X_1}$. After a certain number of queries (say $q$), the
distinguisher outputs a bit $W_q$ depending on the transcript $(X^q, Y^q)$. For a random
system $\mathbf{F}$ and a distinguisher $\mathbf{D}$, let $\mathbf{DF}$ be the random experiment where $\mathbf{D}$
interacts with $\mathbf{F}$. Then for two $(\mathcal{X}, \mathcal{Y})$-random systems $\mathbf{F}$ and $\mathbf{G}$, the
distinguishing advantage of $\mathbf{D}$ in distinguishing systems $\mathbf{F}$ and $\mathbf{G}$ by $q$ queries is defined as
$\Delta^{\mathbf{D}}_q(\mathbf{F}, \mathbf{G}) = |P^{\mathbf{DF}}(W_q = 1) - P^{\mathbf{DG}}(W_q = 1)|$. We are usually interested in the
maximal distinguishing advantage over all such distinguishers, which we denote
by $\Delta_q(\mathbf{F}, \mathbf{G}) = \max_{\mathbf{D}} \Delta^{\mathbf{D}}_q(\mathbf{F}, \mathbf{G})$.

For a random system $\mathbf{F}$, we often consider an internal monotone condition
defined on it. Such a condition is initially satisfied (true), but once it gets
violated, it cannot become true again. We characterize such a condition by a
sequence of events $\mathcal{A} = A_0, A_1, \ldots$ such that $A_0$ always holds, and $A_i$ holds if the
condition holds after query $i$. The probability that a distinguisher $\mathbf{D}$ issuing $q$
queries makes a monotone condition $\mathcal{A}$ fail in the random experiment $\mathbf{DF}$ is
denoted by $\nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q}) = P^{\mathbf{DF}}(\overline{A_q})$ and we are again interested in the maximum
over all distinguishers, denoted by $\nu(\mathbf{F}, \overline{A_q}) = \max_{\mathbf{D}} \nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q})$. For a random
system $\mathbf{F}$ with a monotone condition $\mathcal{A} = A_0, A_1, \ldots$ and a random system
$\mathbf{G}$, we say that $\mathbf{F}$ conditioned on $\mathcal{A}$ is equivalent to $\mathbf{G}$, denoted $\mathbf{F}|\mathcal{A} \equiv \mathbf{G}$, if
$P^{\mathbf{F}}_{Y_i|X^iY^{i-1}A_i} = P^{\mathbf{G}}_{Y_i|X^iY^{i-1}}$ for $i \ge 1$, for all arguments for which $P^{\mathbf{F}}_{Y_i|X^iY^{i-1}A_i}$ is
defined. The following claim was proved in [?].

Lemma 1. If $\mathbf{F}|\mathcal{A} \equiv \mathbf{G}$ then $\Delta_q(\mathbf{F}, \mathbf{G}) \le \nu(\mathbf{F}, \overline{A_q})$.

Let $\mathbf{F}$ be a random system with a monotone condition $\mathcal{A}$. Following [?], we
define $\mathbf{F}$ blocked by $\mathcal{A}$ to be a new random system that behaves exactly like $\mathbf{F}$
while the condition $\mathcal{A}$ is satisfied. Once $\mathcal{A}$ is violated, it only outputs a special
blocking symbol $\bot$ not contained in the output alphabet of $\mathbf{F}$. More formally,
the following mapping is applied to the $i$-th output of $\mathbf{F}$:

$$y_i \mapsto \begin{cases} y_i & \text{if } A_i \text{ holds} \\ \bot & \text{otherwise.} \end{cases}$$
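
The blocking operation can be pictured as a thin wrapper around a system. The sketch below is a toy model, not part of the formal framework: the class names and the `query` interface (returning the output together with a flag saying whether the condition still holds) are hypothetical, and the wrapped system is a lazily sampled random function whose monotone condition is "no two outputs have collided so far".

```python
import random

BOTTOM = object()  # stands for the blocking symbol, outside the output alphabet


class RandomFunctionWithCondition:
    """Toy system: uniform n-bit answers; the condition fails at the first output collision."""

    def __init__(self, n_bits: int, seed=None):
        self._rng = random.Random(seed)
        self._size = 1 << n_bits
        self._table = {}
        self._collision = False

    def query(self, x):
        if x not in self._table:
            y = self._rng.randrange(self._size)
            if y in self._table.values():
                self._collision = True  # once set, never reset: the condition is monotone
            self._table[x] = y
        return self._table[x], not self._collision


class Blocked:
    """The wrapped system blocked by its condition: answers BOTTOM once it is violated."""

    def __init__(self, system):
        self._system = system
        self._violated = False

    def query(self, x):
        y, holds = self._system.query(x)
        self._violated = self._violated or not holds
        return BOTTOM if self._violated else y


if __name__ == "__main__":
    blocked = Blocked(RandomFunctionWithCondition(3, seed=1))
    print([("BOTTOM" if y is BOTTOM else y) for y in map(blocked.query, range(12))])
```

While no collision has occurred, this blocked random function answers with distinct values, which is the intuition behind comparing it with a random permutation.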

The following new lemma relates the optimal advantage in distinguishing two
random systems to the optimal advantage in distinguishing their blocked
counterparts.

Lemma 2. Let $\mathbf{F}$ and $\mathbf{G}$ be two random systems with monotone conditions $\mathcal{A}$
and $\mathcal{B}$ defined on them, respectively. Let $\mathbf{F}^{\bot}$ denote the random system $\mathbf{F}$ blocked
by $\mathcal{A}$ and let $\mathbf{G}^{\bot}$ denote $\mathbf{G}$ blocked by $\mathcal{B}$. Then for every distinguisher $\mathbf{D}$ we have
$\Delta^{\mathbf{D}}_q(\mathbf{F}, \mathbf{G}) \le \Delta_q(\mathbf{F}^{\bot}, \mathbf{G}^{\bot}) + \nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q})$.

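Proof. Let $\mathbf{D}$ be an arbitrary distinguisher for $\mathbf{F}$ and $\mathbf{G}$. Let $\mathbf{D}'$ be a distinguisher
that works as follows: it simulates $\mathbf{D}$, but whenever it receives an answer $\bot$ to its
query, it aborts and outputs 1. Then we have $P^{\mathbf{DG}}[W_q = 1] \le P^{\mathbf{D}'\mathbf{G}^{\bot}}[W_q = 1]$
and $P^{\mathbf{D}'\mathbf{F}^{\bot}}[W_q = 1] \le P^{\mathbf{DF}}[W_q = 1] + \nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q})$.

First, let us assume that $P^{\mathbf{DG}}[W_q = 1] \ge P^{\mathbf{DF}}[W_q = 1]$. Then, using the
definition of advantage and the above inequalities, we get

$$\begin{aligned}
\Delta^{\mathbf{D}}_q(\mathbf{F}, \mathbf{G}) &= |P^{\mathbf{DG}}[W_q = 1] - P^{\mathbf{DF}}[W_q = 1]| \\
&= P^{\mathbf{DG}}[W_q = 1] - P^{\mathbf{DF}}[W_q = 1] \\
&\le P^{\mathbf{D}'\mathbf{G}^{\bot}}[W_q = 1] - (P^{\mathbf{D}'\mathbf{F}^{\bot}}[W_q = 1] - \nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q})) \\
&\le \Delta_q(\mathbf{F}^{\bot}, \mathbf{G}^{\bot}) + \nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q}),
\end{aligned}$$

which proves the lemma in this case. On the other hand, if $P^{\mathbf{DG}}[W_q = 1] <
P^{\mathbf{DF}}[W_q = 1]$, we can easily construct another distinguisher $\mathbf{D}^*$ with the same
behavior as $\mathbf{D}$ and the opposite final answer bit. Then we can proceed with
the argument as before and since $\Delta^{\mathbf{D}}_q(\mathbf{F}, \mathbf{G}) = \Delta^{\mathbf{D}^*}_q(\mathbf{F}, \mathbf{G})$ and
$\nu^{\mathbf{D}}(\mathbf{F}, \overline{A_q}) = \nu^{\mathbf{D}^*}(\mathbf{F}, \overline{A_q})$, the conclusion is valid also for the distinguisher $\mathbf{D}$. □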

Lemma 2 is a generalization of both Lemma 7 from [?] and of its special case,
the Fundamental lemma of game-playing from [3]. Both these lemmas describe
the special case when $\Delta_q(\mathbf{F}^{\bot}, \mathbf{G}^{\bot}) = 0$, i.e., when the distinguished systems
behave identically until some conditions are violated. Our lemma is useful in
the situations where the systems are not identical even while the conditions are
satisfied, but their behavior is very similar. A good example of such a situation
is presented in the proof of Theorem 1.

A random system $\mathbf{F}$ can be used as a component of a larger system: in
particular, we shall consider constructions $\mathbf{C}(\cdot)$ such that the resulting random
system $\mathbf{C}(\mathbf{F})$ invokes $\mathbf{F}$ as a subsystem. We state the following two observations
about the composition of systems.

Lemma 3. Let $\mathbf{C}(\cdot)$ and $\mathbf{C}'(\cdot)$ be two constructions invoking an internal random
system, and let $\mathbf{F}$ and $\mathbf{G}$ be random systems. Then

(i) $\Delta_q(\mathbf{C}(\mathbf{F}), \mathbf{C}(\mathbf{G})) \le \Delta_{q'}(\mathbf{F}, \mathbf{G})$, where $q'$ is the maximum number of
invocations of any internal system $\mathbf{H}$ for any sequence of $q$ queries to $\mathbf{C}(\mathbf{H})$,
if such a value is defined.

(ii) There exists a fixed permutation $S \in \mathrm{Perm}(n)$ (represented by a
deterministic stateless system) such that $\Delta_q(\mathbf{C}(\mathbf{P}), \mathbf{C}'(\mathbf{P})) \le \Delta_q(\mathbf{C}(S), \mathbf{C}'(S))$.

Proof. The first claim comes from [?], so here we only prove the second one.
Since the random system $\mathbf{P}$ can be seen as a system that picks a permutation
uniformly at random from $\mathrm{Perm}(n)$ and then realizes this permutation, we have:

$$\Delta_q(\mathbf{C}(\mathbf{P}), \mathbf{C}'(\mathbf{P})) \le \frac{1}{(2^n)!} \sum_{S \in \mathrm{Perm}(n)} \Delta_q(\mathbf{C}(S), \mathbf{C}'(S)).$$

If all the values $\Delta_q(\mathbf{C}(S), \mathbf{C}'(S))$ were smaller than $\Delta_q(\mathbf{C}(\mathbf{P}), \mathbf{C}'(\mathbf{P}))$ it would
contradict the inequality above, hence there exists a permutation $S \in \mathrm{Perm}(n)$
such that $\Delta_q(\mathbf{C}(\mathbf{P}), \mathbf{C}'(\mathbf{P})) \le \Delta_q(\mathbf{C}(S), \mathbf{C}'(S))$. □

2.3 Ideal Blockciphers and Chains 

We introduce some specific notions related to the cascade encryption setting.
Our terminology follows and extends that in [3].

A blockcipher with keyspace $\{0,1\}^k$ and message space $\{0,1\}^n$ is a mapping
$E : \{0,1\}^k \times \{0,1\}^n \to \{0,1\}^n$ such that for each $K \in \{0,1\}^k$, $E(K, \cdot)$ is a
permutation on the set $\{0,1\}^n$. Typically $E_K(x)$ is written instead of $E(K, x)$
and $E_K^{-1}(\cdot)$ refers to the inverse of the permutation $E_K(\cdot)$.

Throughout the paper, we shall work in the ideal blockcipher model, which was
recently shown to be equivalent to the random oracle model [?]. The ideal
blockcipher model is widely used to analyze blockcipher constructions (e.g. [?])
and consists of the assumption that for each key, the blockcipher realizes an
independent random permutation.

A blockcipher can be seen as a directed graph consisting of $2^n$ vertices
representing the message space and $2^{n+k}$ edges. Each vertex $x$ has $2^k$ outgoing edges
pointing to the encryptions of the message $x$ using all possible keys. Each of the
edges is labeled by the respective key. For a fixed blockcipher $E$, we denote by¹

$$w(E) = \max_{x,y} |\{K \mid E_K(x) = y\}|$$

the maximal number of distinct keys mapping the plaintext $x$ onto the ciphertext
$y$, the maximum taken over all pairs of blocks $(x, y)$. Intuitively, $w(E)$ is the
weight of the heaviest edge in the graph corresponding to $E$. This also naturally
defines a random variable $w(\mathbf{E})$ for the random system $\mathbf{E}$ realizing the ideal
blockcipher.
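
For very small parameters, $w(E)$ can be computed directly from the definition. The sketch below is only an illustration: a blockcipher is represented, as an assumption of this example, by a list of permutation tables indexed by the key, and the function names are hypothetical.

```python
import random
from collections import Counter


def sample_ideal_cipher(k_bits: int, n_bits: int, seed=None) -> list:
    """One independent uniformly random permutation of the blocks per key."""
    rng = random.Random(seed)
    cipher = []
    for _ in range(1 << k_bits):
        perm = list(range(1 << n_bits))
        rng.shuffle(perm)
        cipher.append(perm)  # cipher[K][x] plays the role of E_K(x)
    return cipher


def heaviest_edge_weight(cipher: list) -> int:
    """w(E): the largest number of keys mapping some block x to the same block y."""
    counts = Counter((x, y) for perm in cipher for x, y in enumerate(perm))
    return max(counts.values())


if __name__ == "__main__":
    toy = sample_ideal_cipher(k_bits=6, n_bits=4, seed=2009)
    print(heaviest_edge_weight(toy))
```

Two extreme cases help to read the definition: a cipher that ignores its key has $w(E) = 2^k$, while a cipher whose keys act as distinct cyclic shifts of the message space has $w(E) = 1$.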

¹ $w(E)$ was denoted as $\mathrm{Keys}^E$ in [3].

If a distinguisher makes queries to a blockcipher $E$, let $x \xrightarrow{K} y$ denote the fact
that it either made a query $E_K(x)$ and received the encryption $y$ or made a query
$E_K^{-1}(y)$ and received the decryption $x$. An $r$-chain for keys $(K_1, \ldots, K_r)$ is an
$(r+1)$-tuple $(x_0, K_1, \ldots, K_r)$ for which there exist $x_1, \ldots, x_r$ such that
$x_0 \xrightarrow{K_1} x_1 \xrightarrow{K_2} \cdots \xrightarrow{K_r} x_r$ holds. Similarly, if a fixed permutation $S$ is given and $1 \le i < r$,
then an $i$-disconnected $r$-chain for keys $(K_1, \ldots, K_r)$ with respect to $S$ is an
$(r+1)$-tuple $(x_0, K_1, \ldots, K_r)$ for which there exist $x_1, \ldots, x_r$ such that we have both
$x_0 \xrightarrow{K_{r-i+1}} x_1 \xrightarrow{K_{r-i+2}} \cdots \xrightarrow{K_r} x_i$ and
$S^{-1}(x_i) \xrightarrow{K_1} x_{i+1} \xrightarrow{K_2} \cdots \xrightarrow{K_{r-i}} x_r$. When
describing chains, we sometimes explicitly refer to the permutations instead of the
keys that define them. For disconnected chains, we sometimes omit the reference
to the permutation $S$ if it is clear from the context. The purpose of the following
definition will be clear from the proof of Theorem 1.

Definition 3. Let $S$ be a fixed permutation. A distinguisher examines the key
tuple $(K_1, K_2, \ldots, K_r)$ w.r.t. $S$ if it creates either an $r$-chain or an $i$-disconnected
$r$-chain w.r.t. $S$ for $(K_1, K_2, \ldots, K_r)$ for any $i \in \{1, \ldots, r-1\}$.

3 The Security of Cascade Encryption 

In this section we reestablish the lower bound on the security of triple encryption
from [3] in a more general setting. Our goal here is to simplify the proof and
make it more comprehensible thanks to the level of abstraction provided by the
random systems framework. Using Lemma 2 we also gain an improvement by
a constant factor of 2 (cf. equation (10) in [3]). However, in order to fix the
problem of the proof in [3], a new factor $l$ appears in the security bound.

Although Theorem 1 only explicitly states the security of cascades with odd
length, we point out that a simple reduction argument proves that longer
cascades cannot be less secure than shorter ones, except for a negligible term $l/2^k$.
Therefore, our result also implicitly proves any even cascade to be at least as
secure as a one step shorter odd-length cascade.

We also point out that our bound is only useful for cascades of reasonable
length; for extremely long cascades (e.g. $l \approx 2^{k/2}$) it becomes trivial.

3.1 Proof of the Main Result 

Since this subsection aims to address the overall structure of the proof, we shall
use two technical lemmas without proof (Lemmas 4 and 5). These lemmas correspond
to Lemmas 7 and 9 from [4], which they improve and generalize. We
shall prove them in later subsections.

Let $\ell \ge 3$ be an odd integer. Let $C_1(\cdot,\cdot)$ denote a construction which expects
two subsystems: a blockcipher E and a permutation P. It chooses in advance $\ell$
uniformly distinct keys $K_1, \ldots, K_\ell$. These are not used by the system; their purpose
is to make $C_1(\cdot,\cdot)$ comparable to the other constructions. $C_1(\cdot,\cdot)$ provides
an interface to make forward and backward queries both to the blockcipher E
and to the permutation P.

On the other hand, let $C_2(\cdot)$ denote a construction which expects a blockcipher
E as the only subsystem. It chooses in advance $\ell$ uniformly random keys
$K_1, \ldots, K_\ell$. It provides an interface to make forward and backward queries both
to the blockcipher E and to a permutation P, which it realizes as $E_{K_1} \circ \cdots \circ E_{K_\ell}$.
To achieve this, $C_2(\cdot)$ queries its subsystem for all necessary values. Let $C_2'(\cdot)$
be the same construction as $C_2(\cdot)$ except that it chooses the keys $K_1, \ldots, K_\ell$ to
be uniformly distinct.

Finally, let $C_3(\cdot,\cdot)$ denote a construction which again expects two subsystems:
a blockcipher E and a permutation P. It chooses in advance $\ell$ uniformly
distinct keys $K_1, \ldots, K_\ell$. It provides an interface to make forward and backward
queries both to the blockcipher E and to the permutation P. However,
answers to the blockcipher queries involving the key $K_\ell$ are modified to satisfy
the equation $E_{K_1} \circ \cdots \circ E_{K_\ell} = P$. More precisely, forward queries are realized
as $E_{K_\ell}(x) = P(E_{K_1}^{-1}(\cdots E_{K_{\ell-1}}^{-1}(x) \cdots))$ and backward queries are realized as
$E_{K_\ell}^{-1}(y) = E_{K_{\ell-1}}(E_{K_{\ell-2}}(\cdots E_{K_1}(P^{-1}(y)) \cdots))$. To achieve this, $C_3(\cdot,\cdot)$ queries
its subsystems for all necessary values.
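
The way $C_3$ patches the answers for the last key can be illustrated on a toy block size. The following is only a minimal sketch under stated assumptions: the class name `ToyC3`, the table-based stand-in for an ideal cipher and the parameter choices are all hypothetical, and the composition $E_{K_1} \circ \cdots \circ E_{K_\ell}$ is assumed to apply $E_{K_1}$ first, as the query formulas suggest.

```python
import random


def random_perm(size, rng):
    """Return a uniformly random permutation of range(size) as a list."""
    perm = list(range(size))
    rng.shuffle(perm)
    return perm


def invert(perm):
    """Return the inverse of a permutation given as a list."""
    inv = [0] * len(perm)
    for x, y in enumerate(perm):
        inv[y] = x
    return inv


class ToyC3:
    """Toy model of C_3(E, P): answers under the last key are patched."""

    def __init__(self, n, num_keys, length, seed=0):
        rng = random.Random(seed)
        size = 1 << n
        # Stand-in for the ideal cipher: one independent permutation per key.
        self.E = [random_perm(size, rng) for _ in range(num_keys)]
        self.E_inv = [invert(p) for p in self.E]
        self.P = random_perm(size, rng)
        self.P_inv = invert(self.P)
        # Distinct keys K_1, ..., K_l chosen in advance.
        self.keys = rng.sample(range(num_keys), length)

    def enc(self, key, x):
        if key != self.keys[-1]:
            return self.E[key][x]
        # E_{K_l}(x) = P(E_{K_1}^{-1}( ... E_{K_{l-1}}^{-1}(x) ... ))
        for k in reversed(self.keys[:-1]):
            x = self.E_inv[k][x]
        return self.P[x]

    def dec(self, key, y):
        if key != self.keys[-1]:
            return self.E_inv[key][y]
        # E_{K_l}^{-1}(y) = E_{K_{l-1}}( ... E_{K_1}(P^{-1}(y)) ... )
        y = self.P_inv[y]
        for k in self.keys[:-1]:
            y = self.E[k][y]
        return y

    def cascade(self, x):
        """Apply E_{K_1} first and E_{K_l} last."""
        for k in self.keys:
            x = self.enc(k, x)
        return x
```

With these toy parameters one can check that `cascade(x)` agrees with `P[x]` on every block, which is the equation the modified answers are meant to enforce.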

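Recall that $\mathbf{P}$ and $\mathbf{E}$ denote the uniform random permutation and the ideal
blockcipher, respectively. The following theorem bounds $\Delta_q(C_1(\mathbf{E},\mathbf{P}), C_2(\mathbf{E}))$,
the advantage in distinguishing cascade encryption of length $\ell$ from a random
permutation, given access to the underlying blockcipher.

Theorem 1. For the constructions $C_1(\cdot,\cdot)$, $C_2(\cdot)$ and random systems $\mathbf{E}$, $\mathbf{P}$
defined as above we have

$$\Delta_q(C_1(\mathbf{E},\mathbf{P}), C_2(\mathbf{E})) \le 2\ell \, \frac{\alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil}}{(2^k)^{\underline{\ell}}} + 1.9 \left( \frac{\ell q}{2^{k+n/2}} \right)^{2/3} + \frac{\ell^2}{2^{k+1}},$$

where $\alpha = \max\{2e2^{k-n},\; 2n + k\lfloor \ell/2 \rfloor\}$.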
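
To get a feeling for the size of the three terms, the bound can be evaluated numerically. The sketch below is an illustration only: the function names are hypothetical, the computation is done in floating point on a $\log_2$ scale, and the DES-like parameters in the usage note ($k = 56$, $n = 64$, $\ell = 3$) are an assumed example.

```python
import math


def log2_falling_factorial(k, length):
    """log2 of (2^k)(2^k - 1)...(2^k - length + 1)."""
    total_keys = 2 ** k
    return sum(math.log2(total_keys - i) for i in range(length))


def cascade_bound(length, k, n, q_log2):
    """Evaluate the right-hand side of the bound for q = 2^q_log2 queries."""
    floor_half = length // 2
    ceil_half = (length + 1) // 2
    alpha = max(2 * math.e * 2.0 ** (k - n), 2 * n + k * floor_half)
    # First term: 2 l alpha^floor(l/2) q^ceil(l/2) / (2^k) falling factorial.
    t1 = (1 + math.log2(length) + floor_half * math.log2(alpha)
          + ceil_half * q_log2 - log2_falling_factorial(k, length))
    # Second term: 1.9 (l q / 2^(k + n/2))^(2/3).
    t2 = math.log2(1.9) + (2.0 / 3.0) * (math.log2(length) + q_log2 - k - n / 2.0)
    # Third term: l^2 / 2^(k+1).
    t3 = 2 * math.log2(length) - (k + 1)
    return 2.0 ** t1 + 2.0 ** t2 + 2.0 ** t3
```

For example, `cascade_bound(3, 56, 64, 70)` evaluates the bound for triple encryption with DES-like parameters and $q = 2^{70}$ queries; the value grows with `q_log2`, and once it exceeds 1 the bound is vacuous.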

Proof. First, it is easy to see that $\Delta_q(C_2(\mathbf{E}), C_2'(\mathbf{E})) \le p_{\mathrm{coll}}(2^k, \ell) < \ell^2/2^{k+1}$
and hence we have $\Delta_q(C_1(\mathbf{E},\mathbf{P}), C_2(\mathbf{E})) \le \Delta_q(C_1(\mathbf{E},\mathbf{P}), C_2'(\mathbf{E})) + \ell^2/2^{k+1}$.
However, note that $C_2'(\mathbf{E}) \equiv C_3(\mathbf{E},\mathbf{P})$: this is because in both systems the
permutations $E_{K_1}, \ldots, E_{K_\ell}, P$ are chosen randomly with the only restriction
that $E_{K_1} \circ \cdots \circ E_{K_\ell} = P$ is satisfied. Now we can use Lemma El to substitute the
random permutation $\mathbf{P}$ in both $C_1(\mathbf{E},\mathbf{P})$ and $C_3(\mathbf{E},\mathbf{P})$ for a fixed one. Let S
denote the permutation guaranteed by Lemma 01. Then we have

$$\Delta_q(C_1(\mathbf{E},\mathbf{P}), C_2'(\mathbf{E})) = \Delta_q(C_1(\mathbf{E},\mathbf{P}), C_3(\mathbf{E},\mathbf{P})) \le \Delta_q(C_1(\mathbf{E},S), C_3(\mathbf{E},S)).$$

Since the permutation S is fixed, it now makes no sense for the distinguisher to
query this permutation; it can have the permutation S hardwired.

From now on, we shall denote all queries to a blockcipher that involve one of
the keys $K_1, K_2, \ldots, K_\ell$ as relevant queries. Let us now consider a monotone
condition $\mathcal{A}^h$ ($h \in \mathbb{N}$ is a parameter) defined on the random system $C_1(\mathbf{E}, S)$.
The condition $\mathcal{A}^h_q$ is satisfied if the keys $(K_1, K_2, \ldots, K_\ell)$ were not examined
w.r.t. S (in the sense of Definition 3) by the first q queries and at most h of these q
queries were relevant. Let $\mathcal{B}^h$ be an analogous condition defined on $C_3(\mathbf{E}, S)$: $\mathcal{B}^h_q$
is satisfied if the first q queries did not form a chain for the tuple $(K_1, K_2, \ldots, K_\ell)$
and at most h of these queries were relevant. Let G and H denote the random
systems $C_1(\mathbf{E}, S)$ and $C_3(\mathbf{E}, S)$ blocked by $\mathcal{A}^h$ and $\mathcal{B}^h$, respectively. Then by
Lemma El

$$\Delta_q(C_1(\mathbf{E}, S), C_3(\mathbf{E}, S)) \le \Delta_q(G, H) + \nu(C_1(\mathbf{E}, S), \overline{\mathcal{A}^h_q}).$$

Let us first bound the quantity $\nu(C_1(\mathbf{E}, S), \overline{\mathcal{A}^h_q})$. We can write $\mathcal{A}^h_q$ as $\mathcal{U}_q \wedge \mathcal{V}^h_q$,
where $\mathcal{U}_q$ is satisfied if the first q queries did not examine the tuple of keys
$(K_1, K_2, \ldots, K_\ell)$ and $\mathcal{V}^h_q$ is satisfied if at most h of the first q queries were
relevant. Since $\overline{\mathcal{A}^h_q} \Leftrightarrow \overline{\mathcal{U}_q} \vee \overline{\mathcal{V}^h_q}$, the union bound gives us

$$\nu(C_1(\mathbf{E}, S), \overline{\mathcal{A}^h_q}) \le \nu(C_1(\mathbf{E}, S), \overline{\mathcal{U}_q}) + \nu(C_1(\mathbf{E}, S), \overline{\mathcal{V}^h_q}).$$

We prove in Lemma 4 that $\nu(C_1(\mathbf{E}, S), \overline{\mathcal{U}_q}) \le 2\ell \alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} / (2^k)^{\underline{\ell}}$. Since the
keys $K_1, \ldots, K_\ell$ do not affect the outputs of $C_1(\mathbf{E}, S)$, adaptivity does not help
when trying to violate the condition $\mathcal{V}^h_q$, therefore we can restrict our analysis to
nonadaptive strategies for provoking $\overline{\mathcal{V}^h_q}$. The probability that a given query is
relevant is $\ell/2^k$, hence the expected number of relevant queries among the first q
queries is $\ell q/2^k$ and by Markov's inequality we have $\nu(C_1(\mathbf{E}, S), \overline{\mathcal{V}^h_q}) \le \ell q/(h 2^k)$.
All put together, $\nu(C_1(\mathbf{E}, S), \overline{\mathcal{A}^h_q}) \le 2\ell \alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} / (2^k)^{\underline{\ell}} + \ell q/(h 2^k)$.

It remains to bound $\Delta_q(G, H)$. These systems only differ in their behavior
for the first h relevant queries, so let us make this difference explicit. Let $G_r$
be a random system that allows queries to $\ell$ independent random permutations
$\pi_1, \pi_2, \ldots, \pi_\ell$, but returns $\perp$ once the queries create an $\ell$-chain for any tuple in
$CS(\pi_1, \pi_2, \ldots, \pi_\ell)$. Let $H_r$ be a random system that allows queries to $\ell$ random
permutations $\pi_1, \pi_2, \ldots, \pi_\ell$ such that $\pi_1 \circ \pi_2 \circ \cdots \circ \pi_\ell = \mathrm{id}$, but returns $\perp$ once
the queries create an $\ell$-chain for the tuple $(\pi_1, \pi_2, \ldots, \pi_\ell)$. Let $C_{h,S}(\cdot)$ be a
construction that allows queries to a blockcipher, let us denote it by E. In advance,
it picks $\ell$ random distinct keys $K_1, K_2, \ldots, K_\ell$. Then it realizes the queries to
$E_{K_1}, E_{K_2}, \ldots, E_{K_\ell}$ as $\pi_1, \pi_2, \ldots, \pi_{\ell-1}$ and $\pi_\ell \circ S$ respectively, where the permutations
$\pi_i$ for $i \in \{1, \ldots, \ell\}$ are provided by a subsystem. $E_K$ for all other keys K
are realized by $C_{h,S}(\cdot)$ as random permutations. However, $C_{h,S}(\cdot)$ only redirects
the first h relevant queries to the subsystem; after this number is exceeded, it
responds to all queries by $\perp$. Intuitively, the subsystem used is responsible for
the answers to the first h relevant queries (hence the subscript "r"). Since the
disconnected chains in $C_{h,S}(G_r)$ correspond exactly to the ordinary chains in
$G_r$, we have $C_{h,S}(G_r) \equiv G$ and $C_{h,S}(H_r) \equiv H$. According to Lemma 0 and
Lemma 5 below, we have $\Delta_q(G, H) \le \Delta_h(G_r, H_r) \le h^2/2^n$.

Now we can optimize the choice of the constant h. The part of the advantage
that depends on h is $f(h) = \ell q/(h 2^k) + h^2/2^n$. This term is minimal for
$h^* = (\ell q 2^{n-k-1})^{1/3}$ and we get $f(h^*) < 1.9 \left( \frac{\ell q}{2^{k+n/2}} \right)^{2/3}$. This completes the
proof. □
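
The last step of the proof can be checked numerically. Setting $f'(h) = -\ell q/(h^2 2^k) + 2h/2^n = 0$ gives $h^3 = \ell q 2^{n-k-1}$, and substituting back yields $f(h^*) = \tfrac{3}{2} \cdot 2^{1/3} \cdot (\ell q/2^{k+n/2})^{2/3}$, where $\tfrac{3}{2} \cdot 2^{1/3} \approx 1.89$. The following sketch, with hypothetical function names and assumed example parameters, evaluates both sides in floating point.

```python
def f(h, length, q, k, n):
    """The h-dependent part of the advantage: l*q/(h*2^k) + h^2/2^n."""
    return length * q / (h * 2.0 ** k) + h * h / 2.0 ** n


def h_star(length, q, k, n):
    """Minimizer of f, obtained from f'(h) = 0: h^3 = l*q*2^(n-k-1)."""
    return (length * q * 2.0 ** (n - k - 1)) ** (1.0 / 3.0)


def closed_form_bound(length, q, k, n):
    """The upper bound 1.9 * (l*q / 2^(k+n/2))^(2/3) on f(h*)."""
    return 1.9 * (length * q / 2.0 ** (k + n / 2.0)) ** (2.0 / 3.0)
```

Since f is convex for positive h, comparing `f(h_star(...), ...)` with the values at nearby points is enough to see that the stationary point is indeed the minimum.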


3.2 Examining the Relevant Keys 

Here we analyze the probability that the adversary examines the relevant keys
$(K_1, \ldots, K_\ell)$ w.r.t. S during its interaction with the random system $C_1(\mathbf{E}, S)$.
This is a generalization of Lemma 7 from [4] to longer cascades, also taking
disconnected chains into account.

Lemma 4. Let the random system $C_1(\mathbf{E}, S)$ and the condition $\mathcal{U}_q$ be defined as
in the proof of Theorem 1 with the number of keys $\ell$ being odd. Then we have
$\nu(C_1(\mathbf{E}, S), \overline{\mathcal{U}_q}) \le 2\ell \alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} / (2^k)^{\underline{\ell}}$, where $\alpha = \max\{2e2^{k-n},\; 2n + k\lfloor \ell/2 \rfloor\}$.

Proof. Recall that the relevant keys $K_1, \ldots, K_\ell$ are examined by the distinguisher
if it creates either an $\ell$-chain or an i-disconnected $\ell$-chain for the tuple
$(K_1, K_2, \ldots, K_\ell)$ for any $i \in \{1, \ldots, \ell - 1\}$.

Let $i \in \{1, \ldots, \ell - 1\}$ be fixed. We first bound the probability that the distinguisher
creates an i-disconnected $\ell$-chain. Since the relevant keys do not affect
the behavior of the system $C_1(\mathbf{E}, S)$, this probability is equal to the number of
$\ell$-tuples of distinct keys for which an i-disconnected $\ell$-chain was created, divided
by the number of all $\ell$-tuples of distinct keys, which is $(2^k)^{\underline{\ell}}$. The numerator can
be upper bounded by the number of all i-disconnected $\ell$-chains that were created
(here we also count those created for non-distinct key tuples). Hence, let $\mathrm{Ch}^{E}_{i,\ell,q}$
denote the maximum number of i-disconnected $\ell$-chains any distinguisher can
create by issuing q queries to a fixed blockcipher E and let $\mathrm{Ch}^{\mathbf{E}}_{i,\ell,q}$ denote the
expected value of $\mathrm{Ch}^{E}_{i,\ell,q}$ with respect to the choice of E by $\mathbf{E}$.

Let G be a directed graph corresponding to a blockcipher E, as described in
Section 2. Let H be the spanning subgraph of G containing only the edges
that were queried by the distinguisher. Any i-disconnected $\ell$-chain consists of
$\ell$ edges in H; let us denote them as $e_1, e_2, \ldots, e_\ell$, following the order in which
they appear in the chain. Then for each of the odd edges $e_1, e_3, \ldots, e_\ell$ there
are q possibilities to choose which of the queries corresponds to this edge. Once
the odd edges are fixed, they uniquely determine the vertices $x_0, x_1, \ldots, x_\ell$ such
that $e_j$ is $x_{j-1} \to x_j$ for $j \in \{1, 3, \ldots, \ell\} \setminus \{i + 1\}$ and $e_{i+1}$ is $S^{-1}(x_i) \to x_{i+1}$
if i is even. Since there are at most $w(E)$ possible edges to connect any pair of
vertices in G, there are now at most $w(E)$ possibilities to choose each of the even
edges $e_2, e_4, \ldots, e_{\ell-1}$ so that $e_j$ is $x_{j-1} \to x_j$ for $j \in \{2, 4, \ldots, \ell - 1\} \setminus \{i + 1\}$
and $e_{i+1}$ is $S^{-1}(x_i) \to x_{i+1}$ if i is odd. Hence, $\mathrm{Ch}^{E}_{i,\ell,q} \le w(E)^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil}$ and
$\mathrm{Ch}^{\mathbf{E}}_{i,\ell,q} \le w(\mathbf{E})^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil}$.
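
The counting argument can be sanity-checked by brute force on a toy cipher. The sketch below is an illustration under simplifying assumptions: it counts ordinary (connected) chains rather than disconnected ones, the helper names are hypothetical, and the "cipher" is a small table of random permutations. The same reasoning (q choices for each odd edge, at most $w(E)$ choices for each even edge) applies.

```python
import random


def max_edge_multiplicity(cipher, size):
    """w(E): the largest number of keys that map one x to the same y."""
    best = 0
    for x in range(size):
        counts = {}
        for perm in cipher:
            counts[perm[x]] = counts.get(perm[x], 0) + 1
        best = max(best, max(counts.values()))
    return best


def count_chains(edges, size, length):
    """Count walks made of `length` queried edges (keys may repeat)."""
    ending_at = [1] * size              # one empty walk per vertex
    for _ in range(length):
        nxt = [0] * size
        for _key, x, y in edges:
            nxt[y] += ending_at[x]      # extend every walk ending in x
        ending_at = nxt
    return sum(ending_at)


def toy_experiment(n=3, k=3, length=3, q=20, seed=7):
    """Return (number of chains, w(E), counting bound) for random queries."""
    rng = random.Random(seed)
    size = 1 << n
    cipher = []
    for _ in range(1 << k):
        perm = list(range(size))
        rng.shuffle(perm)
        cipher.append(perm)
    all_edges = [(key, x, cipher[key][x])
                 for key in range(1 << k) for x in range(size)]
    queried = rng.sample(all_edges, q)
    w = max_edge_multiplicity(cipher, size)
    bound = w ** (length // 2) * q ** ((length + 1) // 2)
    return count_chains(queried, size, length), w, bound
```

Calling `toy_experiment()` returns the number of chains together with $w(E)$ and the value $w(E)^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil}$, so the inequality can be inspected directly for different seeds and query counts.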

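It remains to bound the value $w(\mathbf{E})$. For this, we use the bound from [4],
where the inequality $\mathrm{P}[w(\mathbf{E}) \ge \beta] < 2^{2n+1-\beta}$ is proved for any $\beta \ge 2e2^{k-n}$.
Using this inequality gives us

$$\mathrm{Ch}^{\mathbf{E}}_{i,\ell,q} \le \mathrm{E}[\mathrm{Ch}^{E}_{i,\ell,q} \mid w(E) < \alpha] + \mathrm{E}[\mathrm{Ch}^{E}_{i,\ell,q} \mid w(E) \ge \alpha] \cdot 2^{2n+1-\alpha}$$
$$\le \alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} + 2^{k\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} 2^{2n+1-\alpha} \le 2\alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil},$$

where the last two inequalities hold since $w(E) \le 2^k$ and $\alpha \ge 2n + k\lfloor \ell/2 \rfloor \ge 2$.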

Putting all together, we get that the probability of forming an i-disconnected
$\ell$-chain for the keys $(K_1, K_2, \ldots, K_\ell)$ can be upper bounded by $2\alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} / (2^k)^{\underline{\ell}}$.
Since this holds for each $i \in \{1, 2, \ldots, \ell - 1\}$ and the probability of creating an
$\ell$-chain for the keys $(K_1, \ldots, K_\ell)$ can be bounded in the same way, by the union
bound we get $\nu(C_1(\mathbf{E}, S), \overline{\mathcal{U}_q}) \le 2\ell \alpha^{\lfloor \ell/2 \rfloor} q^{\lceil \ell/2 \rceil} / (2^k)^{\underline{\ell}}$. □

3.3 Distinguishing Independent and Correlated Permutations

Now we shall improve the bound on $\Delta_h(G_r, H_r)$ stated by Lemma 9 in [4].
Using the concept of conditional equivalence from [10], our result is better by a
constant factor and is applicable for the general case of $\ell$-cascade encryption.

Recall that $G_r$ is a random system that provides an interface to query $\ell$
random independent permutations² $\pi_1, \ldots, \pi_\ell$ in both directions. However, if
the queries of the distinguisher form an $\ell$-chain for any tuple of permutations
in $CS(\pi_1, \ldots, \pi_\ell)$, the system $G_r$ becomes blocked and answers all subsequent
queries (including the one that formed the chain) with the symbol $\perp$. On the
other hand, $H_r$ is a random system that provides an interface to query $\ell$ random
permutations $\pi_1, \ldots, \pi_\ell$ such that $\pi_1 \circ \cdots \circ \pi_\ell = \mathrm{id}$, again in both directions.
Similarly, if an $\ell$-chain is created for any tuple in $CS(\pi_1, \ldots, \pi_\ell)$ (which is in this
case equivalent to creating an $\ell$-chain for $(\pi_1, \ldots, \pi_\ell)$), $H_r$ answers all subsequent
queries with the symbol $\perp$. Therefore, the value $\Delta_h(G_r, H_r)$ denotes the
best possible advantage in distinguishing $\ell$ independent random permutations
from $\ell$ random permutations correlated in the described way, without forming
an $\ell$-chain.

Lemma 5. Let $G_r$ and $H_r$ be the random systems defined in the proof of Theorem 1.
Then $\Delta_h(G_r, H_r) \le h^2/2^n$.

Proof. First, let us introduce some notation. In any experiment where the permutations
$\pi_1, \ldots, \pi_\ell$ are queried, let $\mathrm{dom}_j(\pi_i)$ denote the set of all $x \in \{0,1\}^n$
such that among the first j queries, the query $\pi_i(x)$ was already answered or
some query $\pi_i^{-1}(y)$ was answered by x. Similarly, let $\mathrm{range}_j(\pi_i)$ be the set of all
$y \in \{0,1\}^n$ such that among the first j queries, the query $\pi_i^{-1}(y)$ was already
answered or some query $\pi_i(x)$ was answered by y. In other words, $\mathrm{dom}_j(\pi_i)$ and
$\mathrm{range}_j(\pi_i)$ denote the domain and range of the partial function $\pi_i$ defined by
the first j answers. For each pair of consecutive permutations³ $\pi_i$ and $\pi_{i+1}$, let
$\mathcal{X}^{(i)}_j$ denote the set $\{0,1\}^n \setminus (\mathrm{range}_j(\pi_i) \cup \mathrm{dom}_j(\pi_{i+1}))$ of fresh, unused values. If
$x \xrightarrow{\pi_i} y$ then we call the queries $\pi_i(x)$ and $\pi_i^{-1}(y)$ trivial and the queries $\pi_{i+1}(y)$
and $\pi_{i-1}^{-1}(x)$ are said to extend a chain if they are not trivial too.

Now we introduce an intermediate random system $\mathbf{S}$ and show how both $G_r$
and $H_r$ are conditionally equivalent to $\mathbf{S}$. This allows us to use Lemma 0 to
bound the advantage in distinguishing $G_r$ and $H_r$. The system $\mathbf{S}$ also provides
an interface to query $\ell$ permutations $\pi_1, \ldots, \pi_\ell$. It works as follows: it answers any
non-trivial forward query $\pi_i(x)$ with a value chosen uniformly from the set $\mathcal{X}^{(i)}_{j-1}$
and any non-trivial backward query $\pi_i^{-1}(x)$ with a value chosen uniformly from
the set $\mathcal{X}^{(i-1)}_{j-1}$ (assuming it is the j-th query). Any trivial queries are answered
consistently with previous answers. Moreover, if the queries form an $\ell$-chain for
any tuple in $CS(\pi_1, \ldots, \pi_\ell)$, $\mathbf{S}$ also gets blocked and responds with $\perp$ to any
further queries. Note that $\mathbf{S}$ is only defined as long as $|\mathcal{X}^{(i)}_{j-1}| > 0$, but if this
is not true, we have $h \ge 2^n$ and the lemma holds trivially.

2 All permutations considered here are defined on the set $\{0,1\}^n$.

3 The indexing of permutations is cyclic, e.g. $\pi_{\ell+1}$ denotes the permutation $\pi_1$.



P. Gazi and U. Maurer 


Let us now consider the j-th query that does not extend an $(\ell - 1)$-chain
(otherwise both $G_r$ and $\mathbf{S}$ get blocked). Then the system $G_r$ answers any non-trivial
forward query $\pi_i(x)$ by a random element uniformly chosen from
$\{0,1\}^n \setminus \mathrm{range}_{j-1}(\pi_i)$ or gets blocked if this answer would create an $\ell$-chain by connecting
two shorter chains. On the other hand, the system $\mathbf{S}$ answers with a random element
uniformly chosen from $\mathcal{X}^{(i)}_{j-1}$, which is a subset of $\{0,1\}^n \setminus \mathrm{range}_{j-1}(\pi_i)$.
The situation for backward queries is analogous. Therefore, let us define a monotone
condition $\mathcal{K}$ on $G_r$: the event $\mathcal{K}_j$ is satisfied if $\mathcal{K}_{j-1}$ was satisfied and the
answer to the j-th query was picked from the set $\mathcal{X}^{(i)}_{j-1}$ if it was a non-trivial
forward query $\pi_i(x)$ or from the set $\mathcal{X}^{(i-1)}_{j-1}$ if it was a non-trivial backward query
$\pi_i^{-1}(y)$. Note that as long as $\mathcal{K}_j$ is satisfied, no $\ell$-chain can emerge by connecting
two shorter chains. By the previous observations and the definition of $\mathcal{K}$, we have
$G_r | \mathcal{K} \equiv \mathbf{S}$ which by Lemma 0 implies $\Delta_h(G_r, \mathbf{S}) \le \nu(G_r, \overline{\mathcal{K}_h})$. The probability
that $\mathcal{K}$ is violated by the j-th answer is

$$\frac{|\mathrm{dom}_{j-1}(\pi_{i+1}) \setminus \mathrm{range}_{j-1}(\pi_i)|}{|\{0,1\}^n \setminus \mathrm{range}_{j-1}(\pi_i)|} \le \frac{|\{0,1\}^n \setminus \mathcal{X}^{(i)}_{j-1}|}{|\{0,1\}^n|} \le \frac{j-1}{2^n},$$

which gives us $\nu(G_r, \overline{\mathcal{K}_h}) \le \sum_{j=1}^{h} (j-1)/2^n \le h^2/2^{n+1}$.

In the system $H_r$, the permutations $\pi_1, \ldots, \pi_\ell$ can be seen as $2^n$ cycles of
length $\ell$, each of which is formed by the edges connecting the vertices $x, \pi_1(x), \ldots,$
$\pi_{\ell-1}(\cdots \pi_1(x) \cdots), x$ for some $x \in \{0,1\}^n$ and labeled by the respective permutations.
We shall call such a cycle used if at least one of its edges was queried
in either direction⁴, otherwise we call it unused. Let us now define a monotone
condition $\mathcal{L}$ on $H_r$: the event $\mathcal{L}_j$ is satisfied if during the first j queries, any non-trivial
query which did not extend an existing chain queried an unused cycle.

We claim that $H_r | \mathcal{L} \equiv \mathbf{S}$. To see this, let us consider all possible types of
queries. If the j-th query $\pi_i(x)$ is trivial or it extends an $(\ell - 1)$-chain, both systems
behave identically. Otherwise, the system $H_r$ answers with a value y, where
$y \notin \mathrm{range}_{j-1}(\pi_i)$ (because $\pi_i$ is a permutation) and $y \notin \mathrm{dom}_{j-1}(\pi_{i+1})$, since that
would mean that $\mathcal{L}$ was violated either earlier (if this query extends an existing
chain) or now (if it starts a new chain). All values from $\mathcal{X}^{(i)}_{j-1}$ have the same
probability of being y, because for any $y_1, y_2 \in \mathcal{X}^{(i)}_{j-1}$ there exists a straightforward
bijective mapping between the arrangements of the cycles consistent with
$\pi_i(x) = y_1$ or $\pi_i(x) = y_2$ (and all previous answers). Therefore, $H_r$ answers with
a uniformly chosen element from $\mathcal{X}^{(i)}_{j-1}$ and so does $\mathbf{S}$. For backward queries,
the situation is analogous. By the same lemma, this gives us $\Delta_h(\mathbf{S}, H_r) \le \nu(H_r, \overline{\mathcal{L}_h})$.

Let the j-th query be a non-trivial forward query $\pi_i(x)$ that does not extend a
chain, i.e., $x \in \mathcal{X}^{(i-1)}_{j-1}$. Let u denote the number of elements in $\mathcal{X}^{(i-1)}_{j-1}$ that are
in a used cycle on the position between $\pi_{i-1}$ and $\pi_i$. Then since every element
in $\mathcal{X}^{(i-1)}_{j-1}$ has the same probability of having this property (for the same reason
as above), this query violates the condition $\mathcal{L}$ with probability
$u/|\mathcal{X}^{(i-1)}_{j-1}| \le (u + |\mathrm{range}_{j-1}(\pi_{i-1}) \cup \mathrm{dom}_{j-1}(\pi_i)|)/2^n \le (j-1)/2^n$. Hence
$\nu(H_r, \overline{\mathcal{L}_h}) \le \sum_{j=1}^{h} (j-1)/2^n \le h^2/2^{n+1}$.

Putting everything together, we have $\Delta_h(G_r, H_r) \le \Delta_h(G_r, \mathbf{S}) + \Delta_h(\mathbf{S}, H_r) \le h^2/2^n$,
which completes the proof. □

4 We consider a separate edge connecting two vertices for each cycle in which they
follow each other, hence each query creates at most one used cycle.


4 Conclusions 

In this paper, we have studied the security of cascade encryption. The most
important recent result on this topic [4] contained a few mistakes, which we
pointed out and corrected. We have formulated the proof from [4] in the random
systems framework, which allows us to describe it on a more abstract level and
thus in a more compact argument. This abstraction leads to a minor improvement
for the case of triple encryption, as well as a generalization for the case of
longer cascades. We prove that for the wide class of blockciphers with smaller
key space than message space, a reasonable increase in the length of the cascade
improves the encryption security. Our intention here was also to demonstrate
the power of the random systems framework as a tool for modelling the behavior
and interactions of discrete systems, with a focus towards analyzing their
indistinguishability.


Acknowledgements. We would like to thank the anonymous reviewers for 
useful comments. This research was partially supported by the Swiss National 
Science Foundation (SNF) project no. 200020-113700/1 and by the grants VEGA 
1/0266/09 and UK/385/2009. 

References 

1. Aiello, W., Bellare, M., Di Crescenzo, G., Venkatesan, R.: Security Amplification 
by Composition: The case of Doubly-Iterated, Ideal Ciphers. In: Krawczyk, H. (ed.)
CRYPTO 1998. LNCS, vol. 1462, pp. 499-558. Springer, Heidelberg (1998) 

2. ANSI X9.52, Triple Data Encryption Algorithm Modes of Operation (1998) 

3. Bellare, M., Namprempre, Ch.: Authenticated Encryption: Relations among No- 
tions and Analysis of the Generic Composition Paradigm, full version, Cryptology 
ePrint Archive, Report 2000/025 (2007) 

4. Bellare, M., Rogaway, P.: Code-Based Game-Playing Proofs and the Security of 
Triple Encryption. In: Eurocrypt 2006. LNCS, vol. 4004, pp. 409-426. Springer, 
Heidelberg (2006), http://eprint.iacr.org/2004/331 

5. Bellare, M., Ristenpart, T.: Hash Functions in the Dedicated-Key Setting: Design 
Choices and MPP Transforms. In: Arge, L., Cachin, C., Jurdzinski, T., Tarlecki, 
A. (eds.) ICALP 2007. LNCS, vol. 4596, pp. 399-410. Springer, Heidelberg (2007) 

6. Coron, J.S., Patarin, J., Seurin, Y.: The Random Oracle Model and the Ideal Ci- 
pher Model are Equivalent. In: Wagner, D. (ed.) CRYPTO 2008. LNCS, vol. 5157, 
pp. 1-20. Springer, Heidelberg (2008) 

7. Diffie, W., Hellman, M.: Exhaustive Cryptanalysis of the Data Encryption Standard.
Computer 10, 74-84 (1977)



8. Even, S., Goldreich, O.: On the Power of Cascade Ciphers. ACM Transactions on 
Computer Systems 3(2), 108-116 (1985) 

9. Even, S., Mansour, Y.: A Construction of a Cipher from a Pseudorandom Permu- 
tation. In: Matsumoto, T., Imai, H., Rivest, R.L. (eds.) ASIACRYPT 1991. LNCS, 
vol. 739, pp. 210-224. Springer, Heidelberg (1993) 

10. Maurer, U.: Indistinguishability of Random Systems. In: Knudsen, L.R. (ed.) 
EUROCRYPT 2002. LNCS, vol. 2332, pp. 110-132. Springer, Heidelberg (2002) 

11. Maurer, U., Massey, J.: Cascade Ciphers: the Importance of Being First. J. of 
Cryptology 6(1), 55-61 (1993) 

12. Maurer, U., Pietrzak, K., Renner, R.: Indistinguishability Amplification. In: 
Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 130-149. Springer, 
Heidelberg (2007) 

13. National Institute of Standards and Technology: FIPS PUB 46-3: Data Encryption 
Standard (DES) (1999) 

14. National Institute of Standards and Technology: Recommendation for the Triple 
Data Encryption Algorithm (TDEA) Block Cipher, NIST Special Publication 800- 
67 (2004) 

15. Rogaway, P., Shrimpton, T.: Deterministic Authenticated-Encryption. In:
Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 373-390. Springer, 
Heidelberg (2006) 


A Problems with the Proof in [4]

The proof of a lower bound for the security of triple encryption presented in [4]
contains some errors. We describe briefly where these errors come from, assuming
the reader is familiar with the terminology and the proof from [4]. We shall be
referring to the version 2.3 of the paper published at the online ePrint archive. The
proof eventually comes down to bounding the advantage in distinguishing independent
random permutations $\pi_0, \pi_1, \pi_2$ from random permutations $\pi_0, \pi_1, \pi_2$
such that $\pi_0 \circ \pi_1 \circ \pi_2 = \mathrm{id}$ (distinguishing games G and H). This can be done
easily if the distinguisher is allowed to extend a 2-chain by his queries, therefore
the adversary is not allowed to do that in games G and H. To justify this, before
proceeding to this part of the proof, the authors have to argue in a more complex
setting (games $D_S$ and $R_3$) that the probability of extending a 2-chain for the
relevant keys is negligible. However, due to the construction of the adversary
$B_{S,b}$ from the adversary B, extending a 2-chain by $B_{S,b}$ in the experiment $H^{B_{S,b}}$
does not correspond to extending a 2-chain by B in $D^B$, but to something we
call a disconnected chain. The same can be said about the experiments $R^B$ and
$G^{B_{S,b}}$. Therefore, by bounding the probability of extending a 2-chain for the
relevant keys in the experiment $R^B$, the authors do not bound the probability
of extending a 2-chain in the experiment $G^{B_{S,b}}$, which they later need.

The second problem of the proof in [4] lies in bounding the probability of creating
a chain using the game L. This is done by the equation $\mathrm{P}[R^B \text{ sets } x2ch] \le
3 \cdot 2^{-k} + \mathrm{P}[B^L \text{ sets } bad]$ on page 19, which is also invalid. To see this, note that the
game L only considers chains using subsequently the keys $(K_0, K_1, K_2)$, while
the flag x2ch in the experiment $R^B$ can also be set by a chain for any cyclic
shift of this triple, e.g. $(K_2, K_0, K_1)$. This is why a new multiplicative factor $\ell$
appears in the security bound we have proved.

In the version 3.0 of the paper [4], the second bug mentioned here was fixed,
while the first is still present in a different form. Now the games G and $H_S$
can be easily distinguished by forming a disconnected chain, for example by the
following trivial adversary B:

Adversary B

    x_1 <-$ {0,1}^n;

    x_2 <- Π(1, x_1); x_3 <- Π(2, x_2); x_0 <- S^{-1}(x_3); x_1' <- Π(0, x_0);

    if x_1 = x_1' return 1 else return 0;

This problem can be fixed by introducing the concept of disconnected chains
and bounding the probability of them being constructed by the adversary, as we
do for the general case of $\ell$-cascades in Lemma 4.

Abstract. In this paper, we prove classical coin-flipping secure in the
presence of quantum adversaries. The proof uses a recent result of Wa- 
trous HU that allows quantum rewinding for protocols of a certain form. 
We then discuss two applications. First, the combination of coin-flipping 
with any non-interactive zero-knowledge protocol leads to an easy trans- 
formation from non-interactive zero-knowledge to interactive quantum 
zero-knowledge. Second, we discuss how our protocol can be applied to 
a recently proposed method for improving the security of quantum pro- 
tocols 0, resulting in an implementation without set-up assumptions. 
Finally, we sketch how to achieve efficient simulation for an extended 
construction in the common-reference-string model. 

Keywords: quantum cryptography, coin-flipping, common reference
string, quantum zero-knowledge.


1 Introduction 

In this paper, we are interested in a standard coin-flipping protocol with classical
message exchange but where the adversary is assumed to be capable of quantum
computing. Secure coin-flipping allows two parties Alice and Bob to agree on a
uniformly random bit in a fair way, i.e., neither party can influence the value of
the coin to his advantage. The (well-known) protocol proceeds as follows: Alice
commits to a bit a, Bob then sends bit b, Alice opens the commitment and the
resulting coin is the exclusive disjunction of both bits, i.e. coin = a ⊕ b.
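
The message flow of this protocol can be sketched in a few lines of Python. This is an illustration under loud assumptions: the hash-based commitment below (SHA-256 over a random nonce and the bit) is only a convenient stand-in and is not the kind of commitment scheme assumed in this paper, and the function names are hypothetical.

```python
import hashlib
import secrets


def commit(bit):
    """Commit to a bit; returns (commitment, opening information)."""
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + bytes([bit])).digest()
    return digest, (nonce, bit)


def verify(commitment, opening):
    """Check that the opening matches the commitment."""
    nonce, bit = opening
    if bit not in (0, 1):
        return False
    return hashlib.sha256(nonce + bytes([bit])).digest() == commitment


def coin_flip(alice_bit, bob_bit):
    """Run the three protocol messages and return the resulting coin."""
    commitment, opening = commit(alice_bit)   # Alice -> Bob: commitment to a
    b = bob_bit                               # Bob -> Alice: bit b
    if not verify(commitment, opening):       # Alice -> Bob: opening of a
        raise ValueError("commitment was not opened correctly")
    return opening[1] ^ b                     # coin = a XOR b
```

In an honest run both parties would draw their bits with `secrets.randbelow(2)`; the coin is then uniform as long as at least one of the two bits is uniform and independent of the other.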

For Alice’s commitment to her first message, we assume a classical bit commitment
scheme. Intuitively, a commitment scheme allows a player to commit
to a value, while keeping it hidden (hiding property) but preserving the possibility
to later reveal the value fixed at commitment time (binding property).
More formally, a bit commitment scheme takes a bit and some randomness as
input. The hiding property is formalized by the non-existence of a distinguisher
able to distinguish with non-negligible advantage between a commitment to 0
and a commitment to 1. The binding property is fulfilled if it is infeasible for a
forger to open one commitment to both values 0 and 1. The hiding respectively
binding property holds with unconditional (i.e. perfect or statistical) security
in the classical and the quantum setting, if the distinguisher respectively the
forger is unrestricted with respect to his (quantum-) computational power. In
case of a polynomial-time bounded classical distinguisher respectively forger, the
commitment is computationally hiding respectively binding. The computationally
hiding property translates to the quantum world by simply allowing the
distinguisher to be quantum. However, the case of a quantum forger cannot be
handled in such a straightforward manner, due to the difficulties of rewinding in
general quantum systems (see e.g. jl 21512 Uj for discussions).

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 52 [eo] 2009.

For our basic coin-flip protocol, we assume the commitment to be unconditionally
binding and computationally hiding against a quantum adversary¹.
Thus, we achieve unconditional security against cheating Alice and
quantum-computational security against dishonest Bob. Such a commitment scheme
follows, for instance, from any pseudorandom generator ca, secure against a 
quantum distinguisher. Even though the underlying computational assumption, 
on which the security of the embedded commitment is based, withstands quan- 
tum attacks, the security proof of the entire protocol and its integration into 
other applications could previously not be naturally translated from the clas- 
sical to the quantum world. Typically, security against a classical adversary is 
argued using rewinding of the adversary. But in general, rewinding as a proof 
technique cannot be directly applied, if Bob runs a quantum computer: First, 
the intermediate state of a quantum system cannot be copied EH , and second, 
quantum measurements are in general irreversible. Hence, in order to produce a 
classical output, the simulator had to (partially) measure the quantum system 
without copying it beforehand, but then it would become generally impossible 
to reconstruct all information necessary for correct rewinding. For these rea- 
sons, no simple and straightforward security proofs for the quantum case were 
previously known. 

In this paper, we show the most natural and direct quantum analogue of the
classical security proof for standard coin-flipping, by using a recent result of
Watrous m. Watrous showed how to construct an efficient quantum simulator for
quantum verifiers for several zero-knowledge proof systems such as graph isomorphism,
where the simulation relies on the newly introduced quantum rewinding
theorem. We now show that his quantum rewinding argument can also be applied
to classical coin-flipping in a quantum world.

By calling the coin-flip functionality sequentially a sufficient number of times, 
the communicating parties can interactively generate a common random string 
from scratch. The generation can then be integrated into other (classical or quan- 
tum) cryptographic protocols that work in the common-reference-string model. 
This way, several interesting applications can be implemented entirely in a simple 
manner without any set-up assumptions. Two example applications are discussed 
in the second part of the paper. 

The first application relates to zero-knowledge proof systems, an important
building block for larger cryptographic protocols. Recently, Hallgren et al. (El
showed that any honest verifier zero-knowledge protocol can be made zero-knowledge
against any classical and quantum verifier. Here we show a related
result, namely, a simple transformation from non-interactive (quantum)
zero-knowledge to interactive quantum zero-knowledge. A non-interactive
zero-knowledge proof system can be trivially turned into an interactive honest verifier
zero-knowledge proof system by just letting the verifier choose the reference
string. Therefore, this consequence of our result also follows from m. However,
our proof is much simpler. In general, the difference between us and 1 1 is that
our focus is on establishing coin-flipping as a stand-alone tool that can be used in
several contexts rather than being integrated in a zero-knowledge construction
as in [T3!.

1 Recall that unconditionally secure commitments, i.e. unconditionally hiding and
binding at the same time, are impossible in both the classical and the quantum
world.

As a second application we discuss the interactive generation of a common reference
string for the general compiler construction improving the security of a
large class of quantum protocols that was recently proposed in [0 . Applying the 
compiler, it has been shown how to achieve hybrid security in existing protocols 
for password-based identification P and oblivious transfer P without significant 
efficiency loss, such that an adversary must have both large quantum memory 
and large computing power to break the protocol. Here we show how a common 
reference string for the compiler can be generated from scratch according to the 
specific protocol requirements in pjj . 

Finally, we sketch an extended commitment scheme for quantum-secure coin- 
flipping in the common-reference-string model. This construction can be effi- 
ciently simulated without the need of rewinding, which is necessary to claim 
universal composability. 


2 Preliminaries 

2.1 Notation 

We assume the reader’s familiarity with basic notation and concepts of quantum 
information processing as in standard literature, e.g. [H3|. Furthermore, we will 
only give the details of the discussed applications that are most important in 
the context of this work. A full description of the applications can be found in 
the referenced papers. 

We denote by negl(n) any function of n such that, for any polynomial p, it holds
that negl(n) < 1/p(n) for large enough n. As a measure of closeness of two quantum
states ρ and σ, their trace distance δ(ρ, σ) = (1/2) tr(|ρ − σ|) or square-fidelity
⟨ρ|σ|ρ⟩ can be applied. A quantum algorithm consists of a family {C_n}_{n∈N} of
quantum circuits and is said to run in polynomial time if the number of gates of
C_n is polynomial in n. Two families of quantum states {ρ_n}_{n∈N} and {σ_n}_{n∈N}
are called quantum-computationally indistinguishable, denoted ρ ≈_q σ, if any
polynomial-time quantum algorithm has negligible advantage in n of distinguishing
ρ_n from σ_n. Analogously, they are statistically indistinguishable, denoted
ρ ≈_s σ, if their trace distance is negligible in n. For the reverse circuit of
quantum circuit Q, we use the standard notation for the transposed, complex
conjugate operation, i.e. Q†. The controlled-NOT operation (CNOT) with a control
and a target qubit as input flips the target qubit if the control qubit is 1. In
other words, the value of the second qubit corresponds to the classical exclusive
disjunction (XOR). A phase-flip operation can be described by Pauli operator Z.
For quantum state ρ stored in register R we write |ρ⟩_R.
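As a small illustration of the trace distance defined above, the following sketch
evaluates δ(ρ, σ) for single-qubit states using only the Python standard library.
It relies on the fact that the difference of two density matrices is Hermitian and
traceless, so for one qubit its eigenvalues are ±λ and the trace distance equals λ.
The function names are hypothetical and chosen for this example only.

```python
import math

def trace_distance_qubit(rho, sigma):
    """Trace distance (1/2) * tr|rho - sigma| for two single-qubit density matrices.

    Each argument is a 2x2 nested list of complex numbers. The difference is
    Hermitian and traceless, so its eigenvalues are +lam and -lam with
    lam = sqrt(d00^2 + |d01|^2), and the trace distance is exactly lam.
    """
    d00 = (rho[0][0] - sigma[0][0]).real
    d01 = rho[0][1] - sigma[0][1]
    return math.sqrt(d00 ** 2 + abs(d01) ** 2)

def pure_state(alpha, beta):
    """Density matrix |v><v| of the normalized qubit state alpha|0> + beta|1>."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    a, b = alpha / norm, beta / norm
    return [[a * a.conjugate(), a * b.conjugate()],
            [b * a.conjugate(), b * b.conjugate()]]

zero = pure_state(1 + 0j, 0j)       # |0>
one = pure_state(0j, 1 + 0j)        # |1>
plus = pure_state(1 + 0j, 1 + 0j)   # (|0> + |1>) / sqrt(2)
```

Orthogonal states such as |0⟩ and |1⟩ are at distance 1, while |0⟩ and |+⟩ are at
distance √(1/2); a family of state pairs is statistically indistinguishable when
this quantity is negligible in n.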

2.2 Definition of Security 

We follow the framework for defining security which was introduced in [?] and
also used in [?]. Our cryptographic two-party protocols run between player Alice,
denoted by A, and player Bob (B). Dishonest parties are indicated by A*
and B*, respectively. The security against a dishonest player is based on the
real/ideal-world paradigm that assumes two different worlds: the real world, which
models the actual protocol Π, and the ideal world, based on the ideal functionality
F that describes the intended behavior of the protocol. If both executions
are indistinguishable, security of the protocol in real life follows. In other words,
a dishonest real-world player P* that attacks the protocol cannot achieve
(significantly) more than an ideal-world adversary P̂* attacking the corresponding
ideal functionality.

More formally, the joint input state consists of classical inputs of honest
parties and possibly quantum input of dishonest players. A protocol Π consists
of an infinite family of interactive (quantum) circuits for parties A and
B. A classical (non-reactive) ideal functionality F is given by a conditional
probability distribution P_{F(in_A, in_B) | in_A in_B}, inducing a pair of random
variables (out_A, out_B) = F(in_A, in_B) for every joint distribution of in_A and
in_B, where in_P and out_P denote party P's in- and output, respectively. For the
definition of (quantum-) computational security against a dishonest Bob, a
polynomial-size (quantum) input sampler is considered, which produces the input
state of the parties.

Definition 2.1 (Correctness). A protocol Π correctly implements an ideal
classical functionality F, if for every distribution of the input values of
honest Alice and Bob, the resulting common outputs of Π and F are statistically
indistinguishable.

Definition 2.2 (Unconditional security against dishonest Alice). A protocol
Π implements an ideal classical functionality F unconditionally securely
against dishonest Alice, if for any real-world adversary A*, there exists an
ideal-world adversary Â*, such that for any input state it holds that the output
state, generated by A* through interaction with honest B in the real world, is
statistically indistinguishable from the output state, generated by Â* through
interaction with F and A* in the ideal world.

Definition 2.3 ((Quantum-) Computational security against dishonest
Bob). A protocol Π implements an ideal classical functionality F (quantum-)
computationally securely against dishonest Bob, if for any (quantum-)
computationally bounded real-world adversary B*, there exists a (quantum-)
computationally bounded ideal-world adversary B̂*, such that for any efficient
input sampler, it holds that the output state, generated by B* through interaction
with honest A in the real world, is (quantum-) computationally indistinguishable
from the output state, generated by B̂* through interaction with F and B* in the
ideal world.

For more details and a definition of indistinguishability of quantum states,
see [?]. There, it has also been shown that protocols satisfying the above
definitions compose sequentially in a classical environment. Furthermore, note
that in Definition 2.2 we do not necessarily require the ideal-world adversary Â*
to be efficient. We show in Section 5 how to extend our coin-flipping construction
such that we can achieve an efficient simulator.

The coin-flipping scheme in Section 5 as well as the example applications in
Sections 4.1 and 4.2 work in the common-reference-string (CRS) model. In this
model, all participants in the real-world protocol have access to a classical public
CRS, which is chosen before any interaction starts, according to a distribution
only depending on the security parameter. However, the participants in the
ideal world interacting with the ideal functionality do not make use of the CRS.
Hence, an ideal-world simulator P̂* that operates by simulating a real-world
adversary P* is free to choose a string in any way he wishes.


3 Quantum-Secure Coin-Flipping 

3.1 The Coin-Flip Protocol 

Let n indicate the security parameter of the commitment scheme which underlies
the protocol. We use an unconditionally binding and quantum-computationally
hiding commitment scheme that takes a bit and some randomness r of length ℓ as
input, i.e. com : {0,1} × {0,1}^ℓ → {0,1}^{ℓ+1}. The unconditionally binding
property is fulfilled if it is impossible for any forger to open one commitment to
both 0 and 1, i.e. to compute r, r' such that com(0, r) = com(1, r').
Quantum-computationally hiding is ensured if no quantum distinguisher can
distinguish between com(0, r) and com(1, r') for random r, r' with non-negligible
advantage. As mentioned earlier, for a specific instantiation we can use, for
instance, Naor's commitment based on a pseudorandom generator [?]. This scheme
does not require any initially shared secret information and is secure against a
quantum distinguisher.²
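To make the instantiation more concrete, here is a minimal sketch of a
commitment in the style of Naor's scheme: the receiver first sends a random
string R of length 3n, and the committer sends G(s) for bit 0 or G(s) XOR R for
bit 1, where G stretches an n-bit seed s to 3n bits. Using SHAKE-256 as the
generator G, the 16-byte seed length, and all function names are assumptions made
for this example; they are not taken from the paper.

```python
import hashlib
import secrets

N_BYTES = 16  # seed length n in bytes (hypothetical parameter choice)

def prg(seed: bytes) -> bytes:
    """Stand-in generator stretching n bytes to 3n bytes (SHAKE-256 is an assumption)."""
    return hashlib.shake_256(seed).digest(3 * len(seed))

def xor(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equally long strings."""
    return bytes(u ^ v for u, v in zip(x, y))

def receiver_setup() -> bytes:
    """Receiver's first message: a uniformly random 3n-byte string R."""
    return secrets.token_bytes(3 * N_BYTES)

def commit(bit: int, R: bytes):
    """Return (commitment, opening): G(s) for bit 0 and G(s) XOR R for bit 1."""
    seed = secrets.token_bytes(N_BYTES)
    c = prg(seed) if bit == 0 else xor(prg(seed), R)
    return c, (bit, seed)

def verify(c: bytes, opening, R: bytes) -> bool:
    """Recompute the commitment from the opening (bit, seed) and compare."""
    bit, seed = opening
    expected = prg(seed) if bit == 0 else xor(prg(seed), R)
    return c == expected
```

Binding here rests on R: a commitment can be opened both ways only if
G(s) XOR G(s') = R for some pair of seeds, which a random R of length 3n avoids
except with small probability. Hiding rests on the pseudorandomness of G.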

We let Alice and Bob run the Coin-Flip Protocol (see Fig. 1), which
interactively generates a random and fair coin in one execution and does not
require any set-up assumptions. Correctness is obvious by inspection of the
protocol: If both players are honest, they independently choose random bits.
These bits are then combined via exclusive disjunction, resulting in a uniformly
random coin.

The corresponding ideal coin-flip functionality F_COIN is described in Figure 2.
Note that dishonest A* may refuse to open com(a, r) in the real world after
learning B's input. For this case, F_COIN allows her a second input refuse, leading
to output fail and modeling the abort of the protocol.

² We describe the commitment scheme in this simple notation. However, if it is based
on a specific scheme, e.g. [?], the precise notation has to be slightly adapted.
Coin-Flip Protocol

1. A chooses a ∈_R {0, 1} and computes com(a, r). She sends com(a, r) to B.

2. B chooses b ∈_R {0, 1} and sends b to A.

3. A sends open(a, r) and B checks if the opening is valid.

4. Both compute coin = a ⊕ b.


Fig. 1. The Coin-Flip Protocol 
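The four steps of the protocol can be sketched for two honest parties as follows.
This is an illustration only: the SHA-256 based `commit` is a toy stand-in that is
merely computationally binding, whereas the protocol requires an unconditionally
binding, quantum-computationally hiding scheme. The function names and the 32-byte
randomness are assumptions of this sketch.

```python
import hashlib
import secrets

def commit(a: int, r: bytes) -> bytes:
    """Toy commitment com(a, r); SHA-256 is a stand-in, NOT unconditionally binding."""
    return hashlib.sha256(r + bytes([a])).digest()

def check_opening(c: bytes, a: int, r: bytes) -> bool:
    """B's check in Step 3: the opened pair (a, r) must reproduce the commitment."""
    return a in (0, 1) and commit(a, r) == c

def coin_flip(alice_bit=None, bob_bit=None):
    """Run Steps 1-4 with honest parties; optional arguments fix the choice bits."""
    # Step 1: A chooses a and r and sends com(a, r).
    a = secrets.randbelow(2) if alice_bit is None else alice_bit
    r = secrets.token_bytes(32)
    c = commit(a, r)
    # Step 2: B, seeing only c, chooses b and sends it to A.
    b = secrets.randbelow(2) if bob_bit is None else bob_bit
    # Step 3: A opens; B checks the opening and aborts if it is invalid.
    if not check_opening(c, a, r):
        return None  # corresponds to output "fail"
    # Step 4: both compute coin = a XOR b.
    return a ^ b
```

Since b is chosen after the commitment to a is fixed and a stays hidden until
Step 3, neither party alone controls a ⊕ b.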


Ideal Functionality F_COIN:

Upon receiving requests start from Alice and Bob, F_COIN outputs a uniformly
random coin to Alice. It then waits to receive Alice's second input ok or refuse
and outputs coin or fail to Bob, respectively.


Fig. 2. The Ideal Coin-Flip Functionality 
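The behavior of the ideal functionality can be sketched as a small state machine.
The class and method names below are hypothetical; the sketch only mirrors the
description in the figure, where the coin is delivered to Alice first and Bob's
output depends on her second input.

```python
import secrets

class IdealCoinFlip:
    """Sketch of the ideal coin-flip functionality described in Fig. 2."""

    def __init__(self):
        self.started = set()
        self.coin = None

    def start(self, party: str):
        """Record a start request from 'A' or 'B'; sample the coin once both asked."""
        self.started.add(party)
        if self.started == {"A", "B"} and self.coin is None:
            self.coin = secrets.randbelow(2)

    def output_to_alice(self):
        """Alice receives the uniformly random coin first."""
        if self.coin is None:
            raise RuntimeError("both parties must send start first")
        return self.coin

    def output_to_bob(self, alice_reply: str):
        """Alice's second input: 'ok' releases the coin to Bob, 'refuse' yields 'fail'."""
        if self.coin is None:
            raise RuntimeError("both parties must send start first")
        return self.coin if alice_reply == "ok" else "fail"
```

The refuse branch models the only deviation left to a dishonest Alice in the
ideal world: she can abort after seeing the coin, but she cannot bias it.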


3.2 Security 

Theorem 3.1. The Coin-Flip Protocol is unconditionally secure against
any unbounded dishonest Alice according to Definition 2.2, provided that the
underlying commitment scheme is unconditionally binding.

Proof. We construct an ideal-world adversary Â*, such that the real output of
the protocol is statistically indistinguishable from the ideal output produced by
Â*, F_COIN and A*.

First note that a, r and com(a, r) are chosen and computed as in the real
protocol. From the statistically binding property of the commitment scheme, it
follows that A*'s choice bit a is uniquely determined from com(a, r), since for any
com, there exists at most one pair (a, r) such that com = com(a, r) (except with
probability negligible in n). Hence in the real world, A* is unconditionally bound
to her bit before she learns B's choice bit, which means a is independent of b.
Therefore in Step 2 the simulator can correctly (but not necessarily efficiently)
compute a (and r). Note that, in the case of unconditional security, we do not
have to require the simulation to be efficient. We show in Section 5 how to
extend the commitment in order to extract A*'s inputs efficiently. Finally, due
to the properties of XOR, A* cannot tell the difference between the random b
computed (from the ideal, random coin) in the simulation in Step 3 and the
randomly chosen b of the real world. It follows that the simulated output is
statistically indistinguishable from the output in the real protocol. □
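The simulation strategy against dishonest Alice (Fig. 3) can be sketched in a few
lines. The sketch makes the inefficiency explicit: the committed bit is recovered
by exhaustive search over all openings, which is feasible here only because the
toy commitment uses 8 bits of randomness. The SHA-256 based commitment, the
parameter `R_BITS`, and the function names are assumptions of this example.

```python
import hashlib
import secrets

R_BITS = 8  # toy randomness length so that exhaustive search is feasible (assumption)

def commit(a: int, r: int) -> bytes:
    """Toy stand-in for com(a, r); not the scheme used in the paper."""
    return hashlib.sha256(bytes([a, r])).digest()

def extract(c: bytes):
    """Exhaustive search for a pair (a, r) that opens c (inefficient in general)."""
    for a in (0, 1):
        for r in range(2 ** R_BITS):
            if commit(a, r) == c:
                return a, r
    return None

def simulate_dishonest_alice(adversary_commitment: bytes, coin: int) -> int:
    """Return the bit b sent to the adversary so that a XOR b equals the ideal coin."""
    opened = extract(adversary_commitment)
    if opened is None:
        return secrets.randbelow(2)  # no valid opening exists, so any b will do
    a, _ = opened
    return coin ^ a  # b = coin XOR a
```

Because the ideal coin is uniform, the bit b = coin ⊕ a handed to the adversary is
uniform as well, which matches the distribution of Bob's bit in a real execution.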

Ideal-World Simulation Â*:

1. Upon receiving com(a, r) from A*, Â* sends start and then ok to F_COIN as first
and second input, respectively, and receives a uniformly random coin.

2. Â* computes a and r from com(a, r).

3. Â* computes b = coin ⊕ a and sends b to A*.

4. Â* waits to receive A*'s last message and outputs whatever A* outputs.


Fig. 3. The Ideal-World Simulation Â*


To prove security against any dishonest quantum-computationally bounded B*,
we show that there exists an ideal-world simulation B̂* with output
quantum-computationally indistinguishable from the output of the protocol in the
real world. In a classical simulation, where we can simply use rewinding, a
polynomial-time simulator works as follows. It requests coin from F_COIN, chooses
random a and r, and computes b' = coin ⊕ a as well as com(a, r). It then sends
com(a, r) to B* and receives B*'s choice bit b. If b = b', the simulation was
successful. Otherwise, the simulator rewinds B* and repeats the simulation. Note
that our security proof should hold also against any quantum adversary. The
polynomial-time quantum simulator proceeds similarly to its classical analogue
but requires quantum registers as work space and relies on the quantum rewinding
lemma of Watrous [?] (see Lemma 1 in the appendix).
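The classical rewinding simulator described above can be sketched as a retry
loop. Here `bob_strategy` is a hypothetical callable that models a classical B*
as a function from the received commitment to its choice bit; calling it afresh
on each attempt plays the role of rewinding. The SHA-256 based commitment is
again a toy stand-in and an assumption of this sketch.

```python
import hashlib
import secrets

def commit(a: int, r: bytes) -> bytes:
    """Toy stand-in for com(a, r) (SHA-256 based; an assumption of this sketch)."""
    return hashlib.sha256(r + bytes([a])).digest()

def simulate_dishonest_bob(bob_strategy, coin: int, max_tries: int = 128):
    """Classical rewinding: restart bob_strategy until its bit matches the guess."""
    for tries in range(1, max_tries + 1):
        a = secrets.randbelow(2)
        r = secrets.token_bytes(32)
        guess = coin ^ a          # b' = coin XOR a
        c = commit(a, r)
        b = bob_strategy(c)       # B* answers with its choice bit b
        if b == guess:            # success: a XOR b equals the ideal coin
            return {"com": c, "b": b, "open": (a, r), "tries": tries}
    raise RuntimeError("simulation failed after max_tries attempts")
```

If the commitment hides a, each attempt succeeds with probability close to 1/2,
so about two attempts are expected. A quantum B* cannot simply be restarted in
this way because its auxiliary input cannot be copied, which is why the quantum
simulator relies on Watrous' rewinding lemma instead.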

In the paper, Watrous proves how to construct a quantum zero-knowledge
proof system for graph isomorphism using his (ideal) quantum rewinding lemma.
The protocol proceeds as a Σ-protocol, i.e. a protocol in three-move form, where
the verifier flips a single coin in the second step and sends this challenge to the
prover. Since these are the essential aspects also in our Coin-Flip Protocol,
we can apply Watrous' quantum rewinding technique (with slight modifications)
as a black-box to our protocol. We also follow his notation and line of argument
here. For a more detailed description and proofs, we refer to [?].

Theorem 3.2. The Coin-Flip Protocol is quantum-computationally secure
against any polynomial-time bounded, dishonest Bob according to Definition 2.3,
provided that the underlying commitment scheme is quantum-computationally
hiding and the success probability of quantum rewinding achieves a non-negligible
lower bound p_0.

Proof. Let W denote B*'s auxiliary input register, containing an h-qubit state
|ψ⟩. Furthermore, let V and B denote B*'s work space, where V is an arbitrary
polynomial-size register and B is a single qubit register. A's classical messages
are considered in the following as being stored in quantum registers A_1 and
A_2. In addition, the quantum simulator uses registers R, containing all possible
choices of a classical simulator, and G, representing its guess b' on B*'s message
b in the second step. Finally, let X denote a working register of size k, which is
initialized to the state |0^k⟩ and corresponds to the collection of all registers as
described above except W.
The quantum rewinding procedure is implemented by a general quantum circuit
R_coin with input (W, X, B*, coin). As a first step, it applies a unitary
(h, k)-quantum circuit Q to (W, X) to simulate the conversation, obtaining
registers (G, Y). Then, a test takes place to observe whether the simulation was
successful. In that case, R_coin outputs the resulting quantum register. Otherwise,
it quantumly rewinds by applying the reverse circuit Q† on (G, Y) to retrieve
(W, X) and then a phase-flip transformation on X before another iteration of Q
is applied. Note that R_coin is essentially the same circuit as R described in [?],
but in our application it depends on the value of a given coin, i.e., we apply
R_0 or R_1 for coin = 0 or coin = 1, respectively. In more detail, Q transforms
(W, X) to (G, Y) by the following unitary operations:

(1) It first constructs the superposition

$$ \frac{1}{\sqrt{2^{\ell+1}}} \sum_{a,r} |a,r\rangle_R \, |com(a,r)\rangle_{A_1} \, |b' = coin \oplus a\rangle_G \, |open(a,r)\rangle_{A_2} \, |0\rangle_B \, |0^{k'}\rangle_V \, |\psi\rangle_W , $$

where k' < k. Note that the state of registers (A_1, G, A_2) corresponds to a
uniform distribution of possible transcripts of the interaction between the
players.

(2) For each possible com(a, r), it then simulates B*'s possible actions by
applying a unitary operator to (W, V, B, A_1) with A_1 as control:

$$ \frac{1}{\sqrt{2^{\ell+1}}} \sum_{a,r} |a,r\rangle_R \, |com(a,r)\rangle_{A_1} \, |b'\rangle_G \, |open(a,r)\rangle_{A_2} \, |b\rangle_B \, |\tilde{\phi}\rangle_V \, |\tilde{\psi}\rangle_W , $$

where φ̃ and ψ̃ describe modified quantum states.

(3) Finally, a CNOT-operation is applied to pair (B, G) with B as control to
check whether the simulator's guess of B*'s choice was correct. The result of
the CNOT-operation is stored in register G:

$$ \frac{1}{\sqrt{2^{\ell+1}}} \sum_{a,r} |a,r\rangle_R \, |com(a,r)\rangle_{A_1} \, |b' \oplus b\rangle_G \, |open(a,r)\rangle_{A_2} \, |b\rangle_B \, |\tilde{\phi}\rangle_V \, |\tilde{\psi}\rangle_W . $$

If we denote with Y the register that contains the residual (h+k−1)-qubit state,
the transformation from (W, X) to (G, Y) by applying Q can be written as

$$ Q \, |\psi\rangle_W |0^k\rangle_X = \sqrt{p} \, |0\rangle_G |\phi_{good}(\psi)\rangle_Y + \sqrt{1-p} \, |1\rangle_G |\phi_{bad}(\psi)\rangle_Y , $$

where 0 < p < 1 and |φ_good(ψ)⟩ denotes the state we want the system to be
in for a successful simulation. R_coin then measures the qubit in register G with
respect to the standard basis, which indicates success or failure of the simulation.
A successful execution (where b = b') results in outcome 0 with probability p. In
that case, R_coin outputs Y. A measurement outcome 1 indicates b ≠ b', in which
case R_coin quantumly rewinds the system, applies a phase-flip (on register X)
and repeats the simulation, i.e. it applies

$$ Q \big( 2 (\mathbb{I} \otimes |0^k\rangle\langle 0^k|) - \mathbb{I} \big) Q^\dagger . $$

Watrous' ideal quantum rewinding lemma (without perturbations) then states
the following: Under the condition that the probability p of a successful
simulation is non-negligible and independent of any auxiliary input, the output
ρ(ψ) of R has square-fidelity close to 1 with state |φ_good(ψ)⟩ of a successful
simulation, i.e.,

$$ \langle \phi_{good}(\psi) | \, \rho(\psi) \, | \phi_{good}(\psi) \rangle \geq 1 - \varepsilon $$

with error bound 0 < ε < 1/2. Note that for the special case where p equals 1/2
and is independent of |ψ⟩, the simulation terminates after at most one rewinding.
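The special case p = 1/2 can be checked by a short calculation. The following
worked derivation is a sketch along the lines of the standard analysis of the
ideal rewinding lemma. The abbreviations γ_0, γ_1, δ_0 and the auxiliary unit
vector δ_1 are notation introduced here for illustration, and |φ_bad(ψ)⟩ is
assumed to denote the normalized state left in Y when the measurement of G
yields 1. The orthogonality condition on δ_1 is where the independence of p from
the auxiliary input enters.

```latex
% Abbreviations (introduced for this sketch):
%   \gamma_0 = |0>_G |\phi_{good}(\psi)>_Y,   \gamma_1 = |1>_G |\phi_{bad}(\psi)>_Y,
%   \delta_0 = |\psi>_W |0^k>_X,   \delta_1 = auxiliary unit vector with X orthogonal to |0^k>.
\begin{align*}
Q\,|\delta_0\rangle &= \sqrt{p}\,|\gamma_0\rangle + \sqrt{1-p}\,|\gamma_1\rangle, \\
Q\,|\delta_1\rangle &= \sqrt{1-p}\,|\gamma_0\rangle - \sqrt{p}\,|\gamma_1\rangle,
  \qquad (\mathbb{I}\otimes\langle 0^k|)\,|\delta_1\rangle = 0, \\
Q^\dagger|\gamma_1\rangle &= \sqrt{1-p}\,|\delta_0\rangle - \sqrt{p}\,|\delta_1\rangle, \\
\big(2(\mathbb{I}\otimes|0^k\rangle\langle 0^k|)-\mathbb{I}\big)\,Q^\dagger|\gamma_1\rangle
  &= \sqrt{1-p}\,|\delta_0\rangle + \sqrt{p}\,|\delta_1\rangle, \\
Q\big(2(\mathbb{I}\otimes|0^k\rangle\langle 0^k|)-\mathbb{I}\big)Q^\dagger|\gamma_1\rangle
  &= 2\sqrt{p(1-p)}\,|\gamma_0\rangle + (1-2p)\,|\gamma_1\rangle .
\end{align*}
```

For p = 1/2 the last line equals |γ_0⟩, so a single rewinding step after a failed
first attempt already yields the good state with certainty. For general p, the
second measurement of G succeeds with probability 4p(1−p).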

However, we cannot apply the exact version of Watrous' rewinding lemma in our
simulation, since the commitment scheme in the protocol is only (quantum-)
computationally hiding. Instead, we must allow for small perturbations in the
quantum rewinding procedure as follows. Let adv denote B*'s advantage over a
random guess on the committed value due to his computing power, i.e.
adv = |p − 1/2|. From the hiding property, it follows that adv is negligible in
the security parameter n. Thus, we can argue that the success probability p is
close to independent of the auxiliary input and Watrous' quantum rewinding lemma
with small perturbations, as stated in the appendix (Lemma 2), applies with
q = 1/2 and ε = adv. All operations in Q can be performed by polynomial-size
circuits, and thus, the simulator has polynomial size (in the worst case).
Furthermore, for negligible ε but non-negligible lower bound p_0 on the success
probability p, it follows that the "closeness" of output ρ(ψ) with good state
|φ_good(ψ)⟩ is slightly reduced but quantum rewinding remains possible.

Finally, to prove security against quantum B*, we construct an ideal-world
quantum simulator B̂* (see Fig. 4), interacting with B* and the ideal
functionality F_COIN and executing Watrous' quantum rewinding algorithm. We then
compare the output states of the real process and the ideal process. In case of
indistinguishable outputs, quantum-computational security against B* follows.


Ideal-World Simulation B̂*:

1. B̂* gets B*'s auxiliary quantum input W and working registers X.

2. B̂* sends start and then ok to F_COIN. It receives a uniformly random coin.

3. Depending on the value of coin, B̂* applies the corresponding circuit R_coin
with input W, X, B* and coin.

4. B̂* receives output register Y with |φ_good(ψ)⟩ and "measures the conversation"
to retrieve the corresponding (com(a, r), b, open(a, r)). It outputs whatever B*
outputs.


Fig. 4. The Ideal-World Simulation B̂*


First note that the superposition constructed as described above in circuit Q
as Step (1) corresponds to all possible random choices of values in the real
protocol. Furthermore, the circuit models any possible strategy of quantum B* in
Step (2), depending on control register |com(a, r)⟩_{A_1}. The CNOT-operation on
(B, G) in Step (3), followed by a standard measurement of G, indicates whether
the guess b' on B*'s choice b was correct. If that was not the case (i.e. b ≠ b'
and measurement result 1), the system gets quantumly rewound by applying
reverse transformations (3)-(1), followed by a phase-flip operation. The procedure
is repeated until the measurement outcome is 0 and hence b = b'. Watrous'
technique then guarantees that, assuming negligible ε and non-negligible p_0, ε'
is negligible and thus the final output ρ(ψ) of the simulation is close to the good
state |φ_good(ψ)⟩. It follows that the output of the ideal simulation is
indistinguishable from the output in the real world for any
quantum-computationally bounded B*. □

4 Applications 

4.1 Interactive Quantum Zero-Knowledge 

Zero-knowledge proofs are an important building block for larger cryptographic
protocols. The notion of (interactive) zero-knowledge (ZK) was introduced by
Goldwasser et al. [?]. Informally, ZK proofs for any NP language L yield no
other knowledge to the verifier than the validity of the assertion proved, i.e.
x ∈ L. Thus, only this one bit of knowledge is communicated from prover to
verifier and zero additional knowledge. For a survey about zero-knowledge, see
for instance [?].

Blum et al. [?] showed that the interaction between prover and verifier in any
ZK proof can be replaced by sharing a short, random common reference string
according to some distribution and available to all parties from the start of the
protocol. Note that a CRS is a weaker requirement than interaction. Since all
information is communicated mono-directionally from prover to verifier, we do not
have to require any restriction on the verifier.

As in the classical case, where ZK protocols exist if one-way functions exist,
quantum zero-knowledge (QZK) is possible under the assumption that quantum
one-way functions exist. In [?], Kobayashi showed that a common reference
string or shared entanglement is necessary for non-interactive quantum
zero-knowledge. Interactive quantum zero-knowledge protocols in restricted
settings were proposed by Watrous in the honest verifier setting [?] and by
Damgård et al. in the CRS model [?], where the latter introduced the first
Σ-protocols for QZK withstanding even active quantum attacks. In [?], Watrous
then proved that several interactive protocols are zero-knowledge against general
quantum attacks.

Recently, Hallgren et al. showed how to transform a Σ-protocol with
stage-by-stage honest verifier zero-knowledge into a new Σ-protocol that is
zero-knowledge against all classical and quantum verifiers. They propose special
bit commitment schemes to limit the number of rounds, and view each round as a
stage in which an honest verifier simulator is assumed. Then, by using a technique
of [?], each stage can be converted to obtain zero-knowledge against any classical
verifier. Finally, Watrous' quantum rewinding lemma is applied in each stage to
prove zero-knowledge also against any quantum verifier.


IQZK^{F_COIN} Protocol:

(COIN)

1. A and B invoke F_COIN k times. If A blocks any output coin_i for i = 1, ..., k
(by sending refuse as second input), B aborts the protocol.

(CRS)

2. A and B compute ω = coin_1 ... coin_k.

(NIZK)

3. A sends π(ω, x) to B. B checks the proof and accepts or rejects accordingly.


Fig. 5. Intermediate Protocol for IQZK

Here, we propose a simpler transformation from non-interactive (quantum)
zero-knowledge (NIZK) to interactive quantum zero-knowledge (IQZK) by
combining the Coin-Flip Protocol with any NIZK Protocol. Our coin-flipping
generates a truly random coin even in the case of a malicious quantum verifier.
A sequence of such coins can then be used in any subsequent NIZK Protocol,
which is also secure against quantum verifiers, since its communication is
mono-directional. Here, we define a (NIZK)-subprotocol as given in [?]: Both
parties A and B get common input x. A common reference string ω of size k allows
the prover A, who knows a witness w, to give a non-interactive zero-knowledge
proof π(ω, x) to a (quantum-) computationally bounded verifier B. By definition,
the (NIZK)-subprotocol is complete and sound and satisfies zero-knowledge.

The IQZK Protocol is shown in Figure 7. To prove that it is an interactive
quantum zero-knowledge protocol, we first construct an intermediate
IQZK^{F_COIN} Protocol (see Fig. 5) that runs with the ideal functionality F_COIN.
Then we prove that the IQZK^{F_COIN} Protocol satisfies completeness, soundness
and zero-knowledge according to standard definitions. Finally, by replacing the
calls to F_COIN with our Coin-Flip Protocol, we can complete the transformation
to the final IQZK Protocol.

Completeness: If x ∈ L, the probability that (A, B) rejects x is negligible in the
length of x.

From the ideal functionality F_COIN it follows that each coin_i in Step 1 is
uniformly random for all i = 1, ..., k. Hence, ω in Step 2 is a uniformly random
common reference string of size k. By definition of any (NIZK)-subprotocol, we
have acceptance probability

$$ \Pr[\omega \in_R \{0,1\}^k, \ \pi(\omega, x) \leftarrow A(\omega, x, w) : B(\omega, x, \pi(\omega, x)) = 1] > 1 - \varepsilon'' , $$

where ε'' is negligible in the length of x. Thus, completeness for the
IQZK^{F_COIN} Protocol follows.

Soundness: If x ∉ L, then for any unbounded prover A*, the probability that
(A*, B) accepts x is negligible in the length of x.

Any dishonest A* might stop the IQZK^{F_COIN} Protocol at any point during
execution. For example, she can block the output in Step 1 or she can refuse to
send a proof π in the (NIZK)-subprotocol. Furthermore, A* can use an invalid ω
(or x) for π. In all of these cases, B will abort without even checking the proof.
Therefore, A*'s best strategy is to "play the entire game", i.e. to execute the
entire IQZK^{F_COIN} Protocol without making obvious cheats.

A* can only convince B in the (NIZK)-subprotocol of a π for any given (i.e.
normally generated) ω with negligible probability

$$ \Pr[\omega \in_R \{0,1\}^k, \ \pi(\omega, x) \leftarrow A^*(\omega, x) : B(\omega, x, \pi(\omega, x)) = 1] . $$

Therefore, the probability that A* can convince B in the entire IQZK^{F_COIN}
Protocol in case of x ∉ L is also negligible (in the length of x) and its soundness
follows.

Zero-Knowledge: An interactive proof system (A, B*) for language L is quantum
zero-knowledge, if for any quantum verifier B*, there exists a simulator
S_{IQZK^{F_COIN}}, such that S_{IQZK^{F_COIN}} ≈_q (A, B*) on common input x ∈ L
and arbitrary additional (quantum) input to B*.

We construct simulator S_{IQZK^{F_COIN}}, interacting with dishonest B* and
simulator S_NIZK. Under the assumption on the zero-knowledge property of any NIZK
Protocol, there exists a simulator S_NIZK that, on input x ∈ L, generates a
random-looking ω together with a valid proof π for x (without knowing witness w).
S_{IQZK^{F_COIN}} is described in Figure 6. It receives a random string ω from
S_NIZK, which now replaces the string of coins produced by the calls to F_COIN in
the IQZK^{F_COIN} Protocol. The "merging" of coins into ω in Step 2 of the protocol
(Fig. 5) is equivalent to the "splitting" of ω into coins in Step 3 of the
simulation (Fig. 6). Thus, the simulated proof π(ω, x) is indistinguishable from a
real proof, which shows that the IQZK^{F_COIN} Protocol is zero-knowledge.


S_{IQZK^{F_COIN}}:

1. S_{IQZK^{F_COIN}} gets input x.

2. It invokes S_NIZK with x and receives π(ω, x).

3. Let ω = coin_1 ... coin_k. S_{IQZK^{F_COIN}} sends each coin_i one by one to B*.

4. S_{IQZK^{F_COIN}} sends π(ω, x) to B* and outputs whatever B* outputs.


Fig. 6. The Simulation of the Intermediate Protocol for IQZK 
IQZK Protocol:

(CFP) For all i = 1, ..., k repeat Steps 1.-4.

1. A chooses a_i ∈_R {0, 1} and computes com(a_i, r_i). She sends com(a_i, r_i) to B.

2. B chooses b_i ∈_R {0, 1} and sends b_i to A.

3. A sends open(a_i, r_i) and B checks if the opening is valid.

4. Both compute coin_i = a_i ⊕ b_i.

(CRS)

5. A and B compute ω = coin_1 ... coin_k.

(NIZK)

6. A sends π(ω, x) to B. B checks the proof and accepts or rejects accordingly.


Fig. 7. Interactive Quantum Zero-Knowledge 
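The structure of the protocol in Fig. 7 can be sketched as k sequential coin
flips whose results are concatenated into the reference string, followed by a
single non-interactive proof. The callbacks `nizk_prove` and `nizk_verify` are
hypothetical placeholders for an arbitrary NIZK subprotocol, and the SHA-256
based commitment is a toy stand-in; neither is specified by the paper.

```python
import hashlib
import secrets

def commit(a: int, r: bytes) -> bytes:
    """Toy stand-in for com(a, r); SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(r + bytes([a])).digest()

def one_coin():
    """One honest execution of Steps 1-4; returns the coin or None on an invalid opening."""
    a, r = secrets.randbelow(2), secrets.token_bytes(32)
    c = commit(a, r)              # Step 1: A commits to a
    b = secrets.randbelow(2)      # Step 2: B sends b
    if commit(a, r) != c:         # Step 3: B checks the opening (passes for honest A)
        return None
    return a ^ b                  # Step 4: coin_i = a_i XOR b_i

def generate_crs(k: int):
    """Steps 1-5: run k coin flips and collect them as omega = coin_1 ... coin_k."""
    coins = []
    for _ in range(k):
        coin = one_coin()
        if coin is None:
            raise RuntimeError("B aborts: invalid opening")
        coins.append(coin)
    return coins

def iqzk(k, x, witness, nizk_prove, nizk_verify):
    """Step 6, using hypothetical callbacks nizk_prove(omega, x, w) and nizk_verify(omega, x, proof)."""
    omega = generate_crs(k)
    proof = nizk_prove(omega, x, witness)
    return nizk_verify(omega, x, proof)
```

The sketch makes the round structure visible: the reference string costs k
sequential three-move executions before the single proof message is sent.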


It would be natural to think that the IQZK Protocol could be proved secure
simply by showing that the IQZK^{F_COIN} Protocol implements some appropriate
functionality and then using the composition theorem from [?]. Unfortunately, a
zero-knowledge protocol - which is not necessarily a proof of knowledge - cannot
be modeled by a functionality in a natural way. We therefore instead prove
explicitly that the IQZK Protocol has the standard properties of a zero-knowledge
proof as follows.

Completeness: From the analysis of the Coin-Flip Protocol and its
indistinguishability from the ideal functionality F_COIN, it follows that if both
players honestly choose random bits, each coin_i for all i = 1, ..., k in the
(CFP)-subprotocol is generated uniformly at random. Thus, ω is a random common
reference string of size k and the acceptance probability of the
(NIZK)-subprotocol as given above holds. Completeness for the IQZK Protocol
follows.

Soundness: Again, we only consider the case where A* executes the entire
protocol without making obvious cheats, since otherwise, B immediately aborts.
Assume that A* could cheat in the IQZK Protocol, i.e., B would accept an invalid
proof with non-negligible probability. Then we could combine A* with simulator
Â* of the Coin-Flip Protocol (Fig. 3) to show that the IQZK^{F_COIN} Protocol
was not sound. This, however, is inconsistent with the previously given soundness
argument and thus proves by contradiction that the IQZK Protocol is sound.

Zero-Knowledge: A simulator S_IQZK can be composed of simulator S_{IQZK^{F_COIN}}
(Fig. 6) and simulator B̂* for the Coin-Flip Protocol (Fig. 4). S_IQZK gets
classical input x as well as quantum input W and X. It then receives a valid proof
π and a random string ω from S_NIZK. As in S_{IQZK^{F_COIN}}, ω is split into
coin_1 ... coin_k. For each coin_i, it will then invoke B̂* to simulate one
coin-flip execution with coin_i as result. In other words, whenever B̂* asks
F_COIN to output a bit (Step 2, Fig. 4), it instead receives this coin_i. The
transcript of the simulation, i.e. π(ω, x) as well as (com(a_i, r_i),
open(a_i, r_i)) ∀i = 1, ..., k and ω = coin_1 ... coin_k, is indistinguishable
from the transcript of the IQZK Protocol for any quantum-computationally bounded
B*, which concludes the zero-knowledge proof.


4.2 Generating Commitment Keys for Improved Quantum 
Protocols 

Recently, Damgård et al. [?] proposed a general compiler for improving the
security of a large class of quantum protocols. Alice starts such protocols by trans-
mitting random BB84-qubits to Bob who measures them in random bases. Then 
some classical messages are exchanged to accomplish different cryptographic 
tasks. The original protocols are typically unconditionally secure against cheat- 
ing Alice, and secure against a so-called benignly dishonest Bob, i.e., Bob is 
assumed to handle most of the received qubits as he is supposed to. Later on in 
the protocol, he can deviate arbitrarily. The improved protocols are then secure 
against an arbitrary computationally bounded (quantum) adversary. The com- 
pilation also preserves security in the bounded-quantum-storage model (BQSM) 
that assumes the quantum storage of the adversary to be of limited size. If the 
original protocol was BQSM-secure, the improved protocol achieves hybrid secu- 
rity, i.e., it can only be broken by an adversary who has large quantum memory 
and large computing power. 

Briefly, the argument for computational security proceeds along the following
lines. After the initial qubit transmission from A to B, B commits to all his
measurement bases and outcomes. The (keyed) dual-mode commitment scheme
that is used must have the special property that the key can be generated
by one of two possible key-generation algorithms: G_H or G_B. Depending on the
key in use, the scheme provides both flavors of security. Namely, with key pkH
generated by G_H, respectively pkB produced by G_B, the commitment scheme is
unconditionally hiding, respectively unconditionally binding. Furthermore, the
scheme is secure against a quantum adversary and it holds that pkH ≈_q pkB. The
commitment construction is described in full detail in [?].

In the real-life protocol, B uses the unconditionally hiding key pkH to maintain
unconditional security against any unbounded A*. To argue security against
a computationally bounded B*, an information-theoretic argument involving
simulator B̂' (see [?]) is given to prove that B* cannot cheat with the
unconditionally binding key pkB. Security in real life then follows from the
quantum-computational indistinguishability of pkH and pkB.

The CRS model is assumed to achieve high efficiency and practicability. Here,
we discuss integrating the generation of a common reference string from scratch
based on our quantum-secure coin-flipping. Thus, we can implement the entire
process in the quantum world, starting with the generation of a CRS without any
initially shared information and using it during compilation as commitment key.


Note that implementing the entire process comes at the cost of a non-constant-round
construction, added to otherwise very efficient protocols under the CRS assumption.
As mentioned in , a dual-mode commitment scheme can be constructed from
the lattice-based cryptosystem of Regev m. It is based on the learning with
errors problem, whose hardness follows from the worst-case (quantum) hardness of the
(general) shortest vector problem. Hence, breaking Regev's cryptosystem implies
an efficient algorithm for approximating the lattice problem, which is assumed to
be hard even quantumly. Briefly, the cryptosystem uses dimension $k$ as security
parameter and is parametrized by two integers $m$ and $p$, where $p$ is a prime,
and a probability distribution on $\mathbb{Z}_p$. A regular public key for Regev's scheme is
indistinguishable from a case where a public key is chosen independently from
the secret key, and in this case, the ciphertext carries essentially no information
about the message. Thus, the public key of a regular key pair can be used as the
unconditionally binding key pkB' in the commitment scheme for the ideal-world
simulation. Then for the real protocol, an unconditionally hiding commitment
key pkH' can simply be constructed by uniformly choosing numbers in $\mathbb{Z}_p \times \mathbb{Z}_p$.
Both public keys will be of size $O(mk \log p)$, and the encryption process involves
only modular additions, which makes its use simple and efficient.
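To make the two key modes concrete, the following is a minimal toy sketch of a Regev-style bit encryption in Python. It is not the construction used in the paper: the parameter values, the error distribution on {-1, 0, 1}, and all function names (`keygen`, `uniform_key`, `encrypt`, `decrypt`) are assumptions chosen for illustration, and the parameters are far too small to be secure. A regular key comes with a secret key that decrypts (the binding mode), while a uniformly sampled key has the same shape but no matching secret key (the hiding mode).

```python
import random


def keygen(k, m, p, rng):
    """Regular key pair: m noisy inner products with a secret s in Z_p^k."""
    s = [rng.randrange(p) for _ in range(k)]
    pk = []
    for _ in range(m):
        a = [rng.randrange(p) for _ in range(k)]
        e = rng.choice((-1, 0, 1))  # toy error distribution (assumption)
        b = (sum(ai * si for ai, si in zip(a, s)) + e) % p
        pk.append((a, b))
    return pk, s


def uniform_key(k, m, p, rng):
    """Key of the same shape, sampled uniformly and independent of any secret."""
    return [([rng.randrange(p) for _ in range(k)], rng.randrange(p))
            for _ in range(m)]


def encrypt(pk, bit, p, rng):
    """Add up a random subset of key rows; only modular additions are used."""
    subset = [row for row in pk if rng.random() < 0.5]
    k = len(pk[0][0])
    a = [sum(row[0][j] for row in subset) % p for j in range(k)]
    b = (sum(row[1] for row in subset) + bit * (p // 2)) % p
    return a, b


def decrypt(sk, ct, p):
    """Decide whether b - <a, s> is closer to 0 or to p/2 (mod p)."""
    a, b = ct
    d = (b - sum(ai * si for ai, si in zip(a, sk))) % p
    return 0 if min(d, p - d) < p // 4 else 1


rng = random.Random(2009)
K, M, P = 8, 20, 2003  # toy parameters; accumulated error is at most M < P/4
pk_binding, sk = keygen(K, M, P, rng)
pk_hiding = uniform_key(K, M, P, rng)
```

With these toy parameters the accumulated error is at most `M`, which stays below `P // 4`, so decryption under the regular key always recovers the bit.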

The idea is now the following. We add (at least) $k$ executions of our
Coin-Flip Protocol as a first step to the construction of 0 to generate
a uniformly random sequence $coin_1 \ldots coin_k$. These $k$ random bits produce a
pkH' as sampled by $G_H$, except with negligible probability. Hence, in the real
world, Bob can use $coin_1 \ldots coin_k$ = pkH' as key for committing to all his basis
choices and measurement outcomes. Since an ideal-world adversary B' is free
to choose any key, it can generate (pkB', sk'), i.e. a regular public key together
with a secret key according to Regev's cryptosystem. For the security proof,
write pkB' = $coin_1 \ldots coin_k$. In the simulation, B' first invokes B* for each $coin_i$
to simulate one coin-flip execution with $coin_i$ as result. As before, whenever B*
asks the ideal functionality $\mathcal{F}_{coin}$ to output a bit, it instead receives this $coin_i$. Then B' has the
possibility to decrypt dishonest B*'s commitments during simulation, which binds B*
unconditionally to his committed measurement bases and outcomes. Finally, as
we proved in the analysis of the Coin-Flip Protocol that pkH' is a uniformly
random string, Regev's proof of semantic security shows that pkH' ≈ pkB', and
(quantum-)computational security of the real protocols in [3] follows.

5 On Efficient Simulation in the CRS Model 

For our Coin-Flip Protocol in the plain model, we cannot claim universal
composability. As already mentioned, in case of unconditional security against
dishonest A* according to Definition 2.2, we do not require the simulator to be
efficient. In order to achieve efficient simulation, the simulator must be able to extract the
choice bit efficiently out of A*'s commitment, such that A*'s input is defined
after this step. The standard approach to do this is to give the simulator some
trapdoor information related to the common reference string that A* does not
have in real life. Therefore, we extend the commitment scheme to build in such
a trapdoor and ensure efficient extraction. To further guarantee UC-security,
we circumvent the necessity of rewinding B* by extending the construction also
with respect to equivocability.

We will adapt an approach to our set-up, which is based on the idea of
UC-commitments jS] and already discussed in the full version of jlj. We require a
Σ-protocol for a (quantumly) hard relation $R = \{(x, w)\}$, i.e. an honest verifier
perfect zero-knowledge interactive proof of knowledge, where the prover shows
that he knows a witness $w$ such that the problem instance $x$ is in the language
$L$ (i.e. $(x, w) \in R$). Conversations are of the form $(a^\Sigma, c^\Sigma, z^\Sigma)$, where the prover sends
$a^\Sigma$, the verifier challenges him with bit $c^\Sigma$, and the prover replies with $z^\Sigma$. For
practical candidates of $R$, see e.g. 0. Instead of the simple commitment scheme,
we use the keyed dual-mode commitment scheme described in Section 4.2 but
now based on a multi-bit version of Regev's scheme HZ;. Still we construct it
such that depending on the key pkH or pkB, the scheme provides either flavor of
security and it holds that pkH ≈ pkB.

In real life, the CRS consists of commitment key pkB and an instance $x'$ for
which it holds that $\nexists\, w'$ such that $(x', w') \in R$, where we assume that $x \approx x'$.
To commit to bit $a$, A runs the honest verifier simulator to get a conversation
$(a^\Sigma, a, z^\Sigma)$. She then sends $a^\Sigma$ and two commitments $c_0, c_1$ to B, where
$c_a = \mathrm{com}_{pkB}(z^\Sigma, r)$ and $c_{1-a} = \mathrm{com}_{pkB}(0^{z'}, r')$ with randomness $r, r'$ and $z' = |z^\Sigma|$.
Then, $a, z^\Sigma, r$ are sent to open the relevant one of $c_0$ or $c_1$, and B checks that
$(a^\Sigma, a, z^\Sigma)$ is an accepting conversation. Assuming that the Σ-protocol is honest
verifier zero-knowledge and pkB leads to unconditionally binding commitments,
the new commitment construction is again unconditionally binding.

During simulation, the simulator chooses a pkB in the CRS such that it knows the
matching decryption key sk. Then, it can extract A*'s choice bit $a$ by decrypting both
$c_0$ and $c_1$ and checking which contains a valid $z^\Sigma$ such that $(a^\Sigma, a, z^\Sigma)$ is
accepting. Note that not both $c_0$ and $c_1$ can contain a valid reply, since otherwise,
A* would know a $w'$ such that $(x', w') \in R$. In order to simulate in case of
B*, the simulator chooses the CRS as pkH and $x$. Hence, the commitment is
unconditionally hiding. Furthermore, it can be equivocated, since $\exists\, w$ with $(x, w) \in R$ and
therefore, $c_0, c_1$ can both be computed with valid replies, i.e. $c_0 = \mathrm{com}_{pkH}(z_0^\Sigma, r)$
and $c_1 = \mathrm{com}_{pkH}(z_1^\Sigma, r')$. Quantum-computational security against B* follows
from the indistinguishability of the keys pkB and pkH and the indistinguishability
of the instances $x$ and $x'$, and efficiency of both simulations is ensured due to
extraction and equivocability.
A Watrous' Quantum Rewinding Lemma

Lemma 1 (Quantum Rewinding Lemma with small perturbations [20]).
Let $Q$ be the unitary $(h,k)$-quantum circuit as given in Wdf . Furthermore, let
$p_0, q \in (0, 1)$ and $\varepsilon \in (0, \frac{1}{2})$ be real numbers such that

1. $|p - q| < \varepsilon$,

2. $p_0(1 - p_0) \le q(1 - q)$, and

3. $p_0 \le p$

for all $h$-qubit states $|\psi\rangle$. Then there exists a general quantum circuit $R$ of size

$$O\left(\frac{\log(1/\varepsilon)\,\mathrm{size}(Q)}{p_0(1 - p_0)}\right)$$

such that, for every $h$-qubit state $|\psi\rangle$, the output $\rho(\psi)$ of $R$ satisfies

$$\langle \phi_{good}(\psi)|\,\rho(\psi)\,|\phi_{good}(\psi)\rangle \ge 1 - \varepsilon',$$

where $\varepsilon' = 16\,\varepsilon\,\frac{\log^2(1/\varepsilon)}{p_0^2(1 - p_0)^2}$.

Note that $p_0$ denotes the lower bound on the success probability $p$, for which
the procedure guarantees correctness. Furthermore, for negligible $\varepsilon$ but
non-negligible $p_0$, it follows that $\varepsilon'$ is negligible. For a more detailed description of
the lemma and the corresponding proofs, we refer to [20].
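As a numeric illustration of the remark that a negligible $\varepsilon$ together with a non-negligible $p_0$ yields a negligible $\varepsilon'$, here is a small Python sketch. It assumes the error term has the form $\varepsilon' = 16\,\varepsilon \log^2(1/\varepsilon) / (p_0^2(1-p_0)^2)$ as in Watrous' lemma, and it assumes base-2 logarithms; the function name `rewinding_error` is hypothetical.

```python
import math


def rewinding_error(eps, p0):
    """Assumed form of the lemma's error term:
    eps' = 16 * eps * log2(1/eps)^2 / (p0^2 * (1 - p0)^2)."""
    if not (0.0 < eps < 0.5 and 0.0 < p0 < 1.0):
        raise ValueError("need eps in (0, 1/2) and p0 in (0, 1)")
    return 16.0 * eps * math.log2(1.0 / eps) ** 2 / (p0 ** 2 * (1.0 - p0) ** 2)


# For a fixed lower bound p0, shrinking eps shrinks eps' as well.
for eps in (1e-6, 1e-9, 1e-12):
    print(eps, rewinding_error(eps, p0=0.25))
```

The polylogarithmic factor slows the decay only mildly, which is why a fixed $p_0$ suffices for the conclusion.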
Abstract. We study quantum protocols among two distrustful parties.
Under the sole assumption of correctness (guaranteeing that honest
players obtain their correct outcomes), we show that every protocol
implementing a non-trivial primitive necessarily leaks information to a 
dishonest player. This extends known impossibility results to all non- 
trivial primitives. We provide a framework for quantifying this leakage 
and argue that leakage is a good measure for the privacy provided to the 
players by a given protocol. Our framework also covers the case where 
the two players are helped by a trusted third party. We show that de- 
spite the help of a trusted third party, the players cannot amplify the 
cryptographic power of any primitive. All our results hold even against 
quantum honest-but-curious adversaries who honestly follow the proto- 
col but purify their actions and apply a different measurement at the 
end of the protocol. As concrete examples, we establish lower bounds on 
the leakage of standard universal two-party primitives such as oblivious 
transfer. 

Keywords: two-party primitives, quantum protocols, quantum informa- 
tion theory, oblivious transfer. 


1 Introduction 

Quantum communication allows to implement tasks which are classically impos- 
sible. The most prominent example is quantum key distribution P] where two 
honest players establish a secure key against an eavesdropper. In the two-party 
setting however, quantum and classical cryptography often show similar limits. 
Oblivious transfer j22j, bit commitment f24l 2dj . and even fair coin tossing [IB] 
are impossible to realize securely both classically and quantumly. On the other 
hand, quantum cryptography allows for some weaker primitives impossible in
the classical world. For example, quantum coin-flipping protocols with maximum
bias of $\frac{1}{\sqrt{2}} - \frac{1}{2}$ exist¹ against any adversary jH| while remaining impossible
based solely on classical communication. A few other weak primitives are known
to be possible with quantum communication. For example, the generation of an
additive secret-sharing for the product $xy$ of two bits, where Alice holds bit $x$ and
Bob bit $y$, has been introduced by Popescu and Rohrlich as machines modeling
non-signaling non-locality (also called NL-boxes) [22]. If Alice and Bob share
an EPR pair, they can simulate an NL-box with symmetric error probability
$\sin^2\frac{\pi}{8}$ f2T)ldj . Equivalently, Alice and Bob can implement 1-out-of-2 oblivious
transfer (1-2-OT) privately provided the receiver Bob gets the bit of his choice
only with probability of error $\sin^2\frac{\pi}{8}$ P . It is easy to verify that even with such
imperfection these two primitives are impossible to realize in the classical world.
This discussion naturally leads to the following question:

— Which two-party cryptographic primitives are possible to achieve using quan- 
tum communication? 

Most standard classical two-party primitives have been shown impossible to
implement securely against weak quantum adversaries reminiscent of the classical
honest-but-curious (HBC) behavior [22]. The idea behind these impossibility
proofs is to consider parties that purify their actions throughout the protocol
execution. This behavior is indistinguishable from the one specified by the
protocol but guarantees that the joint quantum state held by Alice and Bob at any
point during the protocol remains pure. The possibility for players to behave that
way in any two-party protocol has important consequences. For instance, the
impossibility of quantum bit commitment follows from this fact |24l2fij : After the
commit phase, Alice and Bob share the pure state $|\psi^x\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ corresponding
to the commitment of bit $x$. Since a proper commitment scheme provides no
information about $x$ to the receiver Bob, it follows that $\mathrm{tr}_A |\psi^0\rangle\langle\psi^0| = \mathrm{tr}_A |\psi^1\rangle\langle\psi^1|$.
In this case, the Schmidt decomposition guarantees that there exists a unitary
$U_{0,1}$ acting only on Alice's side such that $|\psi^1\rangle = (U_{0,1} \otimes \mathbb{1}_B)|\psi^0\rangle$. In other words,
if the commitment is concealing then Alice can open the bit of her choice by
applying a suitable unitary transform only to her part. A similar argument
allows one to conclude that 1-2-OT is impossible E: Suppose Alice is sending the
pair of bits $(b_0, b_1)$ to Bob through 1-2-OT. Since Alice does not learn Bob's
selection bit, it follows that Bob can get bit $b_0$ before undoing the reception of
$b_0$ and transforming it into the reception of $b_1$ using a local unitary transform
similar to $U_{0,1}$ for bit commitment. For both these primitives, privacy for one
player implies that local actions by the other player can transform the honest
execution with one input into the honest execution with another input.
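The bit-commitment argument can be checked on a small example. The sketch below is an illustration only, not a protocol from the paper: it assumes a toy commitment in which Alice commits to 0 by sending a random computational-basis state and to 1 by sending a random Hadamard-basis state, purifies both choices, and then confirms that Bob's reduced states coincide and that a Hadamard gate on Alice's side plays the role of $U_{0,1}$. All names are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Amplitudes are indexed as state[2*a + b] for |a>_A |b>_B.
psi0 = [S, 0.0, 0.0, S]        # (|0>|0> + |1>|1>) / sqrt(2): commit to 0
psi1 = [0.5, 0.5, 0.5, -0.5]   # (|0>|+> + |1>|->) / sqrt(2): commit to 1
HADAMARD = [[S, S], [S, -S]]


def apply_on_alice(u, state):
    """Apply the 2x2 matrix u to Alice's qubit only (u tensor identity)."""
    out = [0.0] * 4
    for a_out in range(2):
        for a_in in range(2):
            for b in range(2):
                out[2 * a_out + b] += u[a_out][a_in] * state[2 * a_in + b]
    return out


def bob_reduced_state(state):
    """tr_A |psi><psi| for a real-amplitude two-qubit state."""
    return [[sum(state[2 * a + b1] * state[2 * a + b2] for a in range(2))
             for b2 in range(2)] for b1 in range(2)]


# Concealing: Bob sees the maximally mixed state for both committed bits.
rho0, rho1 = bob_reduced_state(psi0), bob_reduced_state(psi1)

# Not binding: a local unitary on Alice's side maps one commitment to the other.
cheated = apply_on_alice(HADAMARD, psi0)
```

Here `cheated` equals `psi1` up to floating-point error, so Alice can switch her committed bit after the commit phase without touching Bob's register.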

In this paper, we investigate the cryptographic power of two-party quan- 
tum protocols against players that purify their actions. This quantum honest- 
but-curious (QHBC) behavior is the natural quantum version of classical HBC 

1 In fact, protocols with better bias are known for weak quantum coin flipping 12512(11271 .
behavior. We consider the setting where Alice obtains random variable $X$ and
Bob random variable $Y$ according to the joint probability distribution $P_{X,Y}$.
Any $P_{X,Y}$ models a two-party cryptographic primitive where neither Alice nor
Bob provides input. For the purpose of this paper, this model is general enough
since any two-party primitive with inputs can be randomized (Alice and Bob
pick their input at random) so that its behavior can be described by a suitable
joint probability distribution $P_{X,Y}$. If the randomized version $P_{X,Y}$ is shown
to be impossible to implement securely by any quantum protocol then the
original primitive with inputs is impossible as well.

Any quantum protocol implementing $P_{X,Y}$ must produce, when both parties
purify their actions, a joint pure state $|\psi\rangle \in \mathcal{H}_{AA'} \otimes \mathcal{H}_{BB'}$ that, when subsystems
$A$ and $B$ are measured in the computational basis, leads to outcomes $X$ and $Y$
according to the distribution $P_{X,Y}$. Notice that the registers $A'$ and $B'$ only provide
the players with extra working space and, as such, do not contribute to the output
of the functionality (so parties are free to measure them the way they want).
In this paper, we adopt a somewhat strict point of view and define a quantum
protocol $\pi$ for $P_{X,Y}$ to be correct if and only if the correct outcomes $X, Y$ are
obtained and the registers $A'$ and $B'$ do not provide any additional information
about $Y$ and $X$ respectively, since otherwise $\pi$ would be implementing a different
primitive $P_{XX',YY'}$ rather than $P_{X,Y}$.

The state $|\psi\rangle$ produced by any correct protocol for $P_{X,Y}$ is called a quantum
embedding of $P_{X,Y}$. An embedding is called regular if the registers $A'$ and $B'$ are
empty. Any embedding $|\psi\rangle \in \mathcal{H}_{AA'} \otimes \mathcal{H}_{BB'}$ can be produced in the QHBC model
by the trivial protocol asking Alice to generate $|\psi\rangle$ before sending the quantum
state in $\mathcal{H}_{BB'}$ to Bob. Therefore, it is sufficient to investigate the cryptographic
power of embeddings in order to understand the power of two-party quantum
cryptography in the QHBC model.

Notice that if $X$ and $Y$ were provided privately to Alice and Bob (through
a trusted third party, for instance) then the expected amount of information
one party gets about the other party's output is minimal and can be quantified
by the Shannon mutual information $I(X;Y)$ between $X$ and $Y$. Assume that
$|\psi\rangle \in \mathcal{H}_{AA'} \otimes \mathcal{H}_{BB'}$ is the embedding of $P_{X,Y}$ produced by a correct quantum
protocol. We define the leakage of $|\psi\rangle$ as

$$\Delta_\psi := \max\{\, S(X;BB') - I(X;Y),\ S(Y;AA') - I(Y;X) \,\}, \qquad (1)$$

where $S(X;BB')$ (resp. $S(Y;AA')$) is the information the quantum registers
$BB'$ (resp. $AA'$) provide about the output $X$ (resp. $Y$). That is, the leakage is the
maximum amount of extra information about the other party's output given the
quantum state held by one party. It turns out that $S(X;BB') = S(Y;AA')$ holds
for all embeddings, exhibiting a symmetry similar to its classical counterpart
$I(X;Y) = I(Y;X)$ and therefore, the two quantities we are taking the maximum
of (in the definition of leakage above) coincide.
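The definition can be evaluated by hand for a very small primitive. The following Python sketch is an illustration under stated assumptions, not a result from the paper: it takes $X$ to be a uniform bit and $Y$ to be $X$ flipped with probability $e$, and it considers only the regular embedding with all phases equal to zero, $\sum_{x,y}\sqrt{P_{X,Y}(x,y)}\,|x,y\rangle$. For that embedding Bob's conditional states are pure, so $S(X;B) = S(\rho_B)$, and $\rho_B$ is an equal mixture of two pure states with overlap $2\sqrt{e(1-e)}$, whose eigenvalues are $(1 \pm 2\sqrt{e(1-e)})/2$. The function names are hypothetical.

```python
import math


def h2(p):
    """Binary Shannon entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def leakage_noisy_bit(e):
    """Leakage S(X;B) - I(X;Y) of the zero-phase regular embedding of the
    primitive 'X uniform bit, Y = X flipped with probability e'."""
    overlap = 2.0 * math.sqrt(e * (1.0 - e))  # <psi^0|psi^1> of Bob's states
    s_x_b = h2((1.0 + overlap) / 2.0)         # entropy of Bob's reduced state
    i_x_y = 1.0 - h2(e)                       # classical mutual information
    return s_x_b - i_x_y


for e in (0.0, 0.1, 0.25, 0.5):
    print(e, leakage_noisy_bit(e))
```

The value is zero at $e = 0$ (a shared bit) and at $e = 1/2$ (independent bits) and positive in between. This only describes one particular embedding; a lower bound for the primitive would require minimizing over all regular embeddings.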

Contributions. Our first contribution establishes that the notion of leakage
is well behaved. We show that the leakage of any embedding for $P_{X,Y}$ is lower
bounded by the leakage of some regular embedding of the same primitive. Thus,
in order to lower bound the leakage of any correct implementation of a given
primitive, it suffices to minimize the leakage over all its regular embeddings. We
also show that the only non-leaking embeddings are the ones for trivial primitives,
where a primitive $P_{X,Y}$ is said to be (cryptographically) trivial if it can be
generated by a classical protocol against HBC adversaries². It follows that any
quantum protocol implementing a non-trivial primitive $P_{X,Y}$ must leak
information under the sole assumption that it produces $(X, Y)$ with the right joint
distribution. This extends known impossibility results for two-party primitives
to all non-trivial primitives.

Embeddings of primitives arise from protocols where Alice and Bob have full
control over the environment. Having in mind that any embedding of a
non-trivial primitive leaks information, it is natural to investigate what tasks can be
implemented without leakage with the help of a trusted third party. The notion
of leakage can easily be adapted to this scenario. We show that no cryptographic
two-party primitive can be implemented without leakage with just one call to the
ideal functionality of a weaker primitive³. This new impossibility result does not
follow from the ones known since they all assume that the state shared between
Alice and Bob is pure.

We then turn our attention to the leakage of correct protocols for a few
concrete universal primitives. From the results described above, the leakage of any
correct implementation of a primitive can be determined by finding the (regular)
embedding that minimizes the leakage. In general, this is not an easy task since
it requires finding the eigenvalues of the reduced density matrix $\rho_A = \mathrm{tr}_B |\psi\rangle\langle\psi|$
(or equivalently $\rho_B = \mathrm{tr}_A |\psi\rangle\langle\psi|$). As far as we know, no known results allow
us to obtain a non-trivial lower bound on the leakage (which is the difference
between the mutual information and accessible information) of non-trivial
primitives. One reason is that in our setting we need to lower bound this difference
with respect to a measurement in one particular basis. However, when $P_{X,Y}$ is
such that the bit-length of either $X$ or $Y$ is short, the leakage can be computed
precisely. We show that any correct implementation of 1-2-OT necessarily leaks
$\frac{1}{2}$ bit. Since NL-boxes and 1-2-OT are locally equivalent, the same minimal
leakage applies to NL-boxes. This is a stronger impossibility result than the
one by Lo m since he assumes perfect/statistical privacy against one party
while our approach only assumes correctness (while both approaches apply even
against QHBC adversaries). We finally show that for Rabin-OT and 1-2-OT of
$r$-bit strings (i.e. ROT$^r$ and 1-2-OT$^r$ respectively), the leakage approaches 1
exponentially in $r$. In other words, correct implementations of these two primitives
trivialize as $r$ increases since the sender gets almost all information about Bob's
2 We are aware of the fact that our definition of triviality encompasses cryptograph- 
ically interesting primitives like coin-tossing and generalizations thereof for which 
highly non-trivial protocols exist |27ISj . However, the important fact (for the pur- 
pose of this paper) is that all these primitives can be implemented by trivial classical 
protocols against HBC adversaries. 

3 The weakness of a primitive will be formally defined in terms of entropic monotones
for classical two-party computation introduced by Wolf and Wullschleger |3(i| , see
Section 4.2.
reception of the string (in case of ROT$^r$) and Bob's choice bit (in case of 1-2-OT$^r$).
These are the first quantitative impossibility results for these primitives and
certainly the first time the hardness of implementing different flavors of string OTs
is shown to increase as the strings to be transmitted get longer.

Finally, we note that our lower bounds on the leakage of the randomized
primitives also lower-bound the minimum leakage for the standard versions of these
primitives⁴ where the players choose their inputs uniformly at random. While
we focus on the typical case where the primitives are run with uniform inputs,
the same reasoning can be applied to primitives with arbitrary distributions
of inputs.

Related Work. Our framework makes it possible to quantify the minimum amount of
leakage, whereas standard impossibility proofs such as the ones of \‘23mmii2r\ do
not in general provide such quantification since they usually assume privacy
for one player in order to show that the protocol must be totally insecure for
the other player⁵. By contrast, we derive lower bounds for the leakage of any
correct implementation. At first glance, our approach seems to contradict
standard impossibility proofs since embeddings leak the same amount towards
both parties. To resolve this apparent paradox it suffices to observe that in
previous approaches only the adversary purified its actions whereas in our case
both parties do. If an honest player does not purify his actions then some leakage
may be lost by the act of irreversibly and unnecessarily measuring some of his
quantum registers.

Our results complement the ones obtained by Colbeck in EH for the setting
where Alice and Bob have inputs and obtain identical outcomes (called
single-function computations). jEj shows that in any correct implementation of
primitives of a certain form, an honest-but-curious player can access more
information about the other party's input than is available through the ideal
functionality. Unlike EH, we deal in our work with the case where Alice and
Bob do not have inputs but might receive different outputs according to a joint
probability distribution. We show that only trivial distributions can be
implemented securely in the QHBC model. Furthermore, we introduce a quantitative
measure of protocol insecurity that lets us answer which embeddings allow the
least effective cheating.

Another notion of privacy in quantum protocols, generalizing its classical
counterpart from ISEH, is proposed by Klauck in US-. Therein, two-party quantum
protocols with inputs for computing a function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$, where $\mathcal{X}$ and
$\mathcal{Y}$ denote Alice's and Bob's respective input spaces, and privacy against QHBC

4 The definition of leakage of an embedding can be generalized to protocols with inputs,
where it is defined as $\max\{\sup_{V_B} S(X;V_B) - I(X;Y),\ \sup_{V_A} S(V_A;Y) - I(X;Y)\}$,
where $X$ and $Y$ involve both inputs and outputs of Alice and Bob, respectively. The
supremum is taken over all possible (quantum) views $V_A$ and $V_B$ of Alice and Bob
obtained by their (QHBC-consistent) actions (and containing their inputs).

5 Trade-offs between the security for one and the security for the other player have
been considered before, but either the relaxation of security has to be very small
or the trade-offs are restricted to particular primitives such as commitments l.'MKil .


On the Power of Two-Party Quantum Cryptography 




adversaries are considered. Privacy of a protocol is measured in terms of privacy
loss, defined for each round of the protocol and fixed distribution of inputs $P_{X',Y'}$
by $S(B;X|Y) = H(X|Y) - S(X|B,Y)$, where $B$ denotes Bob's private working
register, and $X := (X', f(X',Y'))$, $Y := (Y', f(X',Y'))$ represent the complete
views of Alice and Bob, respectively. Privacy loss of the entire protocol is then
defined as the supremum over all joint input distributions, protocol rounds,
and states of working registers. In our framework, privacy loss corresponds to
$S(X;YB) - I(X;Y)$ from Alice's point of view and $S(Y;XA) - I(X;Y)$ from
Bob's point of view. Privacy loss is therefore very similar to our definition of
leakage except that it requires the players to get their respective honest outputs.
As a consequence, the protocol implementing $P_{X,Y}$ by asking one party to
prepare a regular embedding of $P_{X,Y}$ before sending her register to the other party
would have no privacy loss. Moreover, the scenario analyzed in [El is restricted
to primitives which provide the same output $f(X,Y)$ to both players. Another
difference is that since privacy loss is computed over all rounds of a protocol,
a party is allowed to abort, which is not considered QHBC in our setting. In
conclusion, the model of m is different from ours even though the measures of
privacy loss and leakage are similar. m provides interesting results concerning
trade-offs between privacy loss and communication complexity of quantum
protocols, building upon similar results of EEH in the classical scenario. It would be
interesting to know whether a similar operational meaning can also be assigned
to the new measure of privacy introduced in this paper.

A recent result by Künzler et al. m shows that two-party functions that are
securely computable against active quantum adversaries form a strict subset of
the set of functions which are securely computable in the classical HBC model.
This complements our result that the sets of securely computable functions in
both HBC and QHBC models are the same.

Roadmap. In Section 2, we introduce the cryptographic and information-theoretic
notions and concepts used throughout the paper. We define, motivate, and
analyze the generality of modeling two-party quantum protocols by embeddings in
Section 3 and define triviality of primitives and embeddings. In Section 0 we
define the notion of leakage of embeddings, show basic properties and argue that it is
a reasonable measure of privacy. In Section 0 we explicitly lower bound the
leakage of some universal two-party primitives. Finally, in Section 0 we discuss possible
directions for future research and open questions.

2 Preliminaries 

Quantum Information Theory. Let $|\psi\rangle_{AB} \in \mathcal{H}_{AB}$ be an arbitrary pure
state of the joint systems $A$ and $B$. The states of these subsystems are
$\rho_A = \mathrm{tr}_B |\psi\rangle\langle\psi|$ and $\rho_B = \mathrm{tr}_A |\psi\rangle\langle\psi|$, respectively. We denote by $S(A) := S(\rho_A)$ and
$S(B) := S(\rho_B)$ the von Neumann entropy (defined as the Shannon entropy of
the eigenvalues of the density matrix) of subsystem $A$ and $B$ respectively. Since
the joint system is in a pure state, it follows from the Schmidt decomposition
that $S(A) = S(B)$ (see e.g. |2%j). Analogously to their classical counterparts, we
can define quantum conditional entropy $S(A|B) := S(AB) - S(B)$, and quantum
mutual information $S(A;B) := S(A) + S(B) - S(AB) = S(A) - S(A|B)$. Even
though in general, $S(A|B)$ can be negative, $S(A|B) \ge 0$ is always true if $A$ is
a classical register. Let $R = \{(P_X(x), \rho_R^x)\}_{x \in \mathcal{X}}$ be an ensemble of states $\rho_R^x$ with
prior probability $P_X(x)$. The average quantum state is $\rho_R = \sum_{x \in \mathcal{X}} P_X(x)\,\rho_R^x$.
The famous result by Holevo upper-bounds the amount of classical information
about $X$ that can be obtained by measuring $\rho_R$:

Theorem 2.1 (Holevo bound jl4U52( ). Let $Y$ be the random variable describing
the outcome of some measurement applied to $\rho_R$ for $R = \{P_X(x), \rho_R^x\}_{x \in \mathcal{X}}$.
Then, $I(X;Y) \le S(\rho_R) - \sum_x P_X(x)\,S(\rho_R^x)$, where equality can be achieved if and
only if $\{\rho_R^x\}_{x \in \mathcal{X}}$ are simultaneously diagonalizable.

Note that if all states in the ensemble are pure and all different then in order to 
achieve equality in the theorem above, they have to form an orthonormal basis 
of the space they span. In this case, the variable Y achieving equality is the 
measurement outcome in this orthonormal basis. 

Dependent Part. The following definition introduces a random variable
describing the correlation between two random variables $X$ and $Y$, obtained by
collapsing all values $x_1$ and $x_2$ for which $Y$ has the same conditional
distribution to a single value.

Definition 2.2 (Dependent part j.'Ui! 1 ). For two random variables $X, Y$, let
$f_X(x) := P_{Y|X=x}$. Then the dependent part of $X$ with respect to $Y$ is defined
as $X \searrow Y := f_X(X)$.

The dependent part $X \searrow Y$ is the minimum random variable among the random
variables computable from $X$ for which $X \leftrightarrow X \searrow Y \leftrightarrow Y$ forms a Markov chain
m. In other words, for any random variable $K = f(X)$ such that $X \leftrightarrow K \leftrightarrow Y$
is a Markov chain, there exists a function $g$ such that $g(K) = X \searrow Y$.
Immediately from the definition we get several other properties of $X \searrow Y$ |3BI :
$H(Y \mid X \searrow Y) = H(Y \mid X)$, $I(X;Y) = I(X \searrow Y; Y)$, and $X \searrow Y = X \searrow (Y \searrow X)$.
The second and the third formula yield $I(X;Y) = I(X \searrow Y; Y \searrow X)$.
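The collapsing step in the definition of the dependent part can be sketched in a few lines of Python. This is an illustrative helper, not code from the paper: the function name `dependent_part`, the dictionary representation of the joint distribution, and the rounding used to compare conditional distributions are all assumptions.

```python
from collections import defaultdict


def dependent_part(joint):
    """Map each value x (with positive probability) to a label such that two
    values share a label exactly when P(Y | X = x) is the same for both.

    joint: dict mapping (x, y) -> probability.
    """
    px = defaultdict(float)
    for (x, _), p in joint.items():
        px[x] += p
    ys = sorted({y for (_, y) in joint})

    labels, classes = {}, {}
    for x in sorted(px):
        if px[x] <= 0.0:
            continue
        # Conditional distribution of Y given X = x, rounded so that
        # floating-point noise does not split a class (assumption).
        cond = tuple(round(joint.get((x, y), 0.0) / px[x], 12) for y in ys)
        labels[x] = classes.setdefault(cond, len(classes))
    return labels


# X in {0, 1, 2}: for x = 0 and x = 1, Y is a uniform bit; for x = 2, Y = 1.
example = {(0, 0): 0.125, (0, 1): 0.125,
           (1, 0): 0.125, (1, 1): 0.125,
           (2, 1): 0.5}
print(dependent_part(example))
```

In this example the values 0 and 1 are merged because they induce the same conditional distribution on $Y$, while the value 2 keeps its own label.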

The notion of dependent part has been further investigated in jldll5I17j .
Wullschleger and Wolf have shown that the quantities $H(X \searrow Y \mid Y)$ and
$H(Y \searrow X \mid X)$ are monotones for two-party computation fiTTl . That is, none of these
values can increase during classical two-party protocols. In particular, if Alice
and Bob start a protocol from scratch then classical two-party protocols
can only produce $(X, Y)$ such that $H(X \searrow Y \mid Y) = H(Y \searrow X \mid X) = 0$,
since $H(X \searrow Y \mid Y) > 0$ if and only if $H(Y \searrow X \mid X) > 0$ (Sj. Conversely,
any primitive satisfying $H(X \searrow Y \mid Y) = H(Y \searrow X \mid X) = 0$ can be
implemented securely in the honest-but-curious (HBC) model. We call such primitives
trivial⁶.


6 See Footnote 2 for a caveat about this terminology.
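The triviality criterion $H(X \searrow Y \mid Y) = H(Y \searrow X \mid X) = 0$ can be tested numerically for small distributions. The sketch below is an illustration under assumptions: the helper names, the dictionary representation of $P_{X,Y}$, and the numerical tolerance are not from the paper.

```python
import math
from collections import defaultdict


def _labels(joint):
    """Label each x by its conditional distribution P(Y | X = x)."""
    px = defaultdict(float)
    for (x, _), p in joint.items():
        px[x] += p
    ys = sorted({y for (_, y) in joint})
    labels, classes = {}, {}
    for x in sorted(px):
        if px[x] > 0.0:
            cond = tuple(round(joint.get((x, y), 0.0) / px[x], 12) for y in ys)
            labels[x] = classes.setdefault(cond, len(classes))
    return labels


def _entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


def monotone(joint):
    """H(dependent part of X w.r.t. Y | Y), computed as H(K, Y) - H(Y)."""
    labels = _labels(joint)
    pky, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        if p > 0.0:
            pky[(labels[x], y)] += p
            py[y] += p
    return _entropy(pky) - _entropy(py)


def is_trivial(joint, tol=1e-9):
    """Check both monotones; tol is an arbitrary numerical tolerance."""
    swapped = {(y, x): p for (x, y), p in joint.items()}
    return monotone(joint) < tol and monotone(swapped) < tol


shared_coin = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
noisy_bit = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
```

A shared coin and a pair of independent bits both pass the check, whereas the noisy shared bit does not, since each party remains uncertain about the other's dependent part.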
Purification. All security questions we ask are with respect to (quantum)
honest-but-curious adversaries. In the classical honest-but-curious adversary
model (HBC), the parties follow the instructions of a protocol but store all
information available to them. Quantum honest-but-curious adversaries (QHBC),
on the other hand, are allowed to behave in an arbitrary way that cannot be
distinguished from their honest behavior by the other player.

Almost all impossibility results in quantum cryptography rely upon a quantum
honest-but-curious behavior of the adversary. This behavior consists in purifying
all actions of the honest players. Purifying means that instead of invoking
classical randomness from a random tape, for instance, the adversary relies upon
quantum registers holding all random bits needed. The operations to be
executed from the random outcome are then performed quantumly without fixing
the random outcomes. For example, suppose a protocol instructs a party to pick
with probability $p$ state $|\phi^0\rangle_C$ and with probability $1 - p$ state $|\phi^1\rangle_C$ before
sending it to the other party through the quantum channel $C$. The purified
version of this instruction looks as follows: Prepare a quantum register in state
$\sqrt{p}\,|0\rangle_R + \sqrt{1-p}\,|1\rangle_R$ holding the random process. Add a new register initially in
state $|0\rangle_C$ before applying the unitary transform $U : |r\rangle_R |0\rangle_C \mapsto |r\rangle_R |\phi^r\rangle_C$ for
$r \in \{0, 1\}$, send register $C$ through the quantum channel and keep register $R$.

From the receiver's point of view, the purified behavior is indistinguishable
from the one relying upon a classical source of randomness because in both cases,
the state of register $C$ is $\rho = p\,|\phi^0\rangle\langle\phi^0| + (1 - p)\,|\phi^1\rangle\langle\phi^1|$. All operations invoking
classical randomness can be purified similarly |2. v >l21l22fT7j . The result is that
measurements are postponed as much as possible and only extract information
required to run the protocol in the sense that only when both players need
to know a random outcome, the corresponding quantum register holding the
random coin will be measured. If both players purify their actions then the joint
state at any point during the execution will remain pure, until the very last step
of the protocol when the outcomes are measured.
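The indistinguishability claim for the purified instruction can be verified directly on a small example. The Python sketch below is an illustration only; the function names and the example states $|\phi^0\rangle = |0\rangle$, $|\phi^1\rangle = |+\rangle$ are assumptions. It builds the purified state $\sqrt{p}\,|0\rangle_R|\phi^0\rangle_C + \sqrt{1-p}\,|1\rangle_R|\phi^1\rangle_C$, traces out the register $R$ that the sender keeps, and compares the result with the classical mixture.

```python
import math


def purified_state(p, phi0, phi1):
    """Amplitudes state[r][c] of sqrt(p)|0>_R|phi0>_C + sqrt(1-p)|1>_R|phi1>_C."""
    return [[math.sqrt(p) * a for a in phi0],
            [math.sqrt(1.0 - p) * a for a in phi1]]


def reduced_on_c(state):
    """Trace out register R: rho_C[i][j] = sum_r state[r][i] * conj(state[r][j])."""
    d = len(state[0])
    return [[sum(state[r][i] * state[r][j].conjugate() for r in range(2))
             for j in range(d)] for i in range(d)]


def classical_mixture(p, phi0, phi1):
    """p |phi0><phi0| + (1 - p) |phi1><phi1|."""
    d = len(phi0)
    return [[p * phi0[i] * phi0[j].conjugate()
             + (1.0 - p) * phi1[i] * phi1[j].conjugate()
             for j in range(d)] for i in range(d)]


s = 1.0 / math.sqrt(2.0)
phi0 = [1 + 0j, 0j]          # |0>
phi1 = [s + 0j, s + 0j]      # |+>
p = 0.3

rho_purified = reduced_on_c(purified_state(p, phi0, phi1))
rho_classical = classical_mixture(p, phi0, phi1)
```

Both matrices agree entry by entry up to floating-point error, which is the sense in which the receiver cannot tell the purified sender from the honest one.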

Secure Two-Party Computation. In Section El we investigate the leakage 
of several universal cryptographic two-party primitives. By universality we mean 
that any two-party secure function evaluation can be reduced to them. We in- 
vestigate the completely randomized versions where players do not have inputs 
but receive randomized outputs instead. Throughout this paper, the term prim- 
itive usually refers to the joint probability distribution defining its randomized 
version. Any protocol implementing the standard version of a primitive (with in- 
puts) can also be used to implement a randomized version of the same primitive, 
with the “inputs” chosen according to an arbitrary fixed probability distribution. 

3 Two-Party Protocols and Their Embeddings 

3.1 Correctness 

In this work, we consider cryptographic primitives providing $X$ to honest player
Alice and $Y$ to honest player Bob according to a joint probability distribution
$P_{X,Y}$. The goal of this section is to define when a protocol $\pi$ correctly implements
the primitive $P_{X,Y}$. The first natural requirement is that once the actions of $\pi$ are
purified by both players, measurements of registers $A$ and $B$ in the computational
basis⁷ provide joint outcome $(X, Y) = (x, y)$ with probability $P_{X,Y}(x, y)$.

Protocol $\pi$ can use extra registers $A'$ on Alice's and $B'$ on Bob's side
providing them with (quantum) working space. The purification of all actions of $\pi$
therefore generates a pure state $|\psi\rangle \in \mathcal{H}_{AB} \otimes \mathcal{H}_{A'B'}$. A second requirement for
the correctness of the protocol $\pi$ is that these extra registers are only used as
working space, i.e. the final state $|\psi\rangle_{ABA'B'}$ is such that the content of Alice's
working register $A'$ does not give her any further information about Bob's
output $Y$ than what she can infer from her honest output $X$ and vice versa for $B'$.
Formally, we require that $S(XA';Y) = I(X;Y)$ and $S(X;YB') = I(X;Y)$ or
equivalently, that $A' \leftrightarrow X \leftrightarrow Y$ and $X \leftrightarrow Y \leftrightarrow B'$ form Markov chains⁸.

Definition 3.1. A protocol $\pi$ for $P_{X,Y}$ is correct if measuring registers $A$ and
$B$ of its final state in the computational basis yields outcomes $X$ and $Y$ with
distribution $P_{X,Y}$ and the final state satisfies $S(X; YB') = S(XA'; Y) = I(X; Y)$
where $A'$ and $B'$ denote the extra working registers of Alice and Bob. The state
$|\psi\rangle \in \mathcal{H}_{AB} \otimes \mathcal{H}_{A'B'}$ is called an embedding of $P_{X,Y}$ if it can be produced by the
purification of a correct protocol for $P_{X,Y}$.

We would like to point out that our definition of correctness is stronger than the 
usual classical notion which only requires the correct distribution of the output 
of the honest players. For example, the trivial classical protocol for the primitive
$P_{X,Y}$ in which Alice samples both players’ outputs $X$ and $Y$, sends $Y$ to Bob, but
keeps a copy of $Y$ for herself, is not correct according to our definition, because
it implements a fundamentally different primitive, namely $P_{XY,Y}$.

3.2 Regular Embeddings 

We call an embedding $|\psi\rangle_{ABA'B'}$ regular if the working registers $A'$, $B'$ are empty.
Formally, let $\Theta_{n,m} := \{\theta : \{0,1\}^n \times \{0,1\}^m \to [0, 2\pi)\}$ be the set of functions
mapping bit-strings of length $m + n$ to real numbers between 0 and $2\pi$.

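Definition 3.2. For a joint probability distribution $P_{X,Y}$ where $X \in \{0,1\}^n$
and $Y \in \{0,1\}^m$, we define the set

$\mathcal{E}(P_{X,Y}) := \Big\{ |\psi\rangle \in \mathcal{H}_{AB} : |\psi\rangle = \sum_{x,y} e^{i\theta(x,y)} \sqrt{P_{X,Y}(x,y)}\, |x,y\rangle_{AB},\ \theta \in \Theta_{n,m} \Big\}$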

7 It is clear that every quantum protocol for which the final measurement (providing
$(x, y)$ with distribution $P_{X,Y}$ to the players) is not in the computational basis can
be transformed into a protocol of the described form by two additional local unitary
transformations.

8 Markov chains with quantum ends have been defined and used in earlier works.
It is straightforward to verify that the entropic condition
$S(XA'; Y) = I(X; Y)$ is equivalent to $A' \leftrightarrow X \leftrightarrow Y$ being a Markov chain and
similarly for the other condition.

and call any state $|\psi\rangle \in \mathcal{E}(P_{X,Y})$ a regular embedding of the joint probability
distribution $P_{X,Y}$.

Clearly, any $|\psi\rangle \in \mathcal{E}(P_{X,Y})$ produces $(X, Y)$ with distribution $P_{X,Y}$ since the
probability that Alice measures $x$ and Bob measures $y$ in the computational basis
is $|\langle\psi|x,y\rangle|^2 = P_{X,Y}(x, y)$. In order to specify a particular regular embedding
one only needs to give the description of the phase function $\theta(x,y)$. We denote
by $|\psi_\theta\rangle \in \mathcal{E}(P_{X,Y})$ the quantum embedding of $P_{X,Y}$ with phase function $\theta$. The
constant function $\theta(x,y) := 0$ for all $x \in \{0,1\}^n$, $y \in \{0,1\}^m$ corresponds to
what we call canonical embedding $|\psi_0\rangle := \sum_{x,y} \sqrt{P_{X,Y}(x,y)}\, |x,y\rangle_{AB}$.
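
As a minimal sketch of the definition above (not part of the original text), the following Python snippet builds the amplitudes of a regular embedding from a distribution and a phase function, and checks that a computational-basis measurement reproduces $P_{X,Y}$ whatever the phases are. The dictionary encoding of $P_{X,Y}$, the function names and the example phase function are assumptions made for illustration; the toy distribution is randomized one-bit Rabin OT.

```python
import cmath
import math


def regular_embedding(p, theta=None):
    """Amplitudes of |psi_theta> = sum_{x,y} e^{i theta(x,y)} sqrt(P(x,y)) |x,y>.

    p     : dict mapping (x, y) -> probability P_{X,Y}(x, y)
    theta : function (x, y) -> phase; None gives the canonical embedding.
    """
    if theta is None:
        theta = lambda x, y: 0.0
    return {xy: cmath.exp(1j * theta(*xy)) * math.sqrt(q) for xy, q in p.items()}


def measurement_distribution(amplitudes):
    """Outcome probabilities |<psi|x,y>|^2 of a computational-basis measurement."""
    return {xy: abs(a) ** 2 for xy, a in amplitudes.items()}


# Toy primitive: randomized 1-bit Rabin OT (Bob gets the bit or an erasure).
rot1 = {(0, 0): 0.25, (0, "erased"): 0.25, (1, 1): 0.25, (1, "erased"): 0.25}

# An arbitrary (hypothetical) phase function; any choice gives a regular embedding.
psi = regular_embedding(rot1, theta=lambda x, y: 0.7 * x + (1.3 if y == "erased" else 0.0))
probs = measurement_distribution(psi)
```

The phases drop out of the measurement statistics, which is why a regular embedding is fully specified by $\theta$ alone.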

In Lemma 4.3 below we show that every primitive $P_{X,Y}$ has a regular embedding
which is in some sense the most secure among all embeddings of $P_{X,Y}$.


3.3 Trivial Classical Primitives and Trivial Embeddings 

In this section, we define triviality of classical primitives and (bipartite) embed- 
dings. We show that for any non-trivial classical primitive, its canonical quantum 
embedding is also non-trivial. Intuitively, a primitive $P_{X,Y}$ is trivial if $X$ and $Y$
can be generated by Alice and Bob from scratch in the classical honest-but-curious
(HBC) model⁹. Formally, we define triviality via an entropic quantity
based on the notion of dependent part (see Section 2).

Definition 3.3. A primitive $P_{X,Y}$ is called trivial if it satisfies $H(X \searrow Y \mid Y) = 0$,
or equivalently, $H(Y \searrow X \mid X) = 0$. Otherwise, the primitive is called
non-trivial.

Definition 3.4. A regular embedding $|\psi\rangle_{AB} \in \mathcal{E}(P_{X,Y})$ is called trivial if either
$S(X \searrow Y \mid B) = 0$ or $S(Y \searrow X \mid A) = 0$. Otherwise, we say that $|\psi\rangle_{AB}$ is
non-trivial.
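
The classical condition of Definition 3.3 can be checked mechanically. The sketch below assumes the usual definition of the dependent part, $X \searrow Y := f(X)$ with $f(x) = P_{Y|X=x}$, so that $H(X \searrow Y \mid Y) = 0$ holds exactly when every value $y$ is compatible with a single conditional distribution $P_{Y|X=x}$. The function names and the dictionary encoding of $P_{X,Y}$ are illustrative assumptions, not notation from the paper.

```python
from collections import defaultdict
from fractions import Fraction


def dependent_part(p):
    """Map each x to a hashable description of P_{Y|X=x} (the value of X dep. on Y)."""
    px = defaultdict(Fraction)
    for (x, _), q in p.items():
        px[x] += q
    cond = defaultdict(dict)
    for (x, y), q in p.items():
        if q > 0:
            cond[x][y] = q / px[x]
    return {x: frozenset(d.items()) for x, d in cond.items()}


def is_trivial(p):
    """True iff H(X dep. part | Y) = 0, i.e. each y determines the class of x."""
    cls = dependent_part(p)
    seen = {}
    for (x, y), q in p.items():
        if q > 0 and seen.setdefault(y, cls[x]) != cls[x]:
            return False
    return True


quarter, half = Fraction(1, 4), Fraction(1, 2)
independent = {(x, y): quarter for x in (0, 1) for y in (0, 1)}   # two private coins
shared_bit = {(0, 0): half, (1, 1): half}                          # one shared coin
rot1 = {(0, 0): quarter, (0, "erased"): quarter,
        (1, 1): quarter, (1, "erased"): quarter}                   # Rabin OT

results = (is_trivial(independent), is_trivial(shared_bit), is_trivial(rot1))
```

Independent coins and a shared coin are trivial, while randomized Rabin OT is not: the erasure symbol is compatible with both values of Alice's bit, which have different conditional distributions.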

Notice that unlike in the classical case, $S(X \searrow Y \mid B) = 0 \Leftrightarrow S(Y \searrow X \mid A) = 0$
does not hold in general. As an example, consider a shared quantum state
where the computational basis corresponds to the Schmidt basis for only one
of its subsystems, say for $A$. Let $|\psi\rangle = \alpha|0\rangle_A|\xi_0\rangle_B + \beta|1\rangle_A|\xi_1\rangle_B$ be such that
both subsystems are two-dimensional, $\{|\xi_0\rangle, |\xi_1\rangle\} \neq \{|0\rangle, |1\rangle\}$, $\langle\xi_0|\xi_1\rangle = 0$ and
$|\langle\xi_0|0\rangle| \neq |\langle\xi_1|0\rangle|$. We then have $S(X|B) = 0$ and $S(Y|A) > 0$ while $X = X \searrow Y$
and $Y = Y \searrow X$.

To illustrate this definition of triviality, we argue in the following that if a 
primitive $P_{X,Y}$ has a trivial regular embedding, there exists a classical protocol
which generates $X, Y$ securely in the HBC model. Let $|\psi\rangle \in \mathcal{E}(P_{X,Y})$ be trivial
and assume without loss of generality that $S(Y \searrow X \mid A) = 0$. Intuitively, this
means that Alice can learn everything possible about Bob’s outcome Y (Y could 
include some private coin-flips on Bob’s side, but that is “filtered out” by the 
dependent part). More precisely, Alice holding register A can measure her part of 


See Footnote 0 for a caveat about this terminology.

the shared state to completely learn a realization of $Y \searrow X$, specifying $P_{X|Y=y}$.
She then chooses $X$ according to the distribution $P_{X|Y=y}$. An equivalent way of
trivially generating $(X, Y)$ classically is the following classical protocol:

1. Alice samples $P_{X|Y=y'}$ from distribution $P_{Y \searrow X}$ and announces its outcome
to Bob. She samples $x$ from the distribution $P_{X|Y=y'}$.

2. Bob picks $y$ with probability $P_{Y \mid Y \searrow X = P_{X|Y=y'}}(y)$.

Of course, the same reasoning applies in case $S(X \searrow Y \mid B) = 0$ with the roles
of Alice and Bob reversed.
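
The two protocol steps above can be sketched as follows. This is an illustration only: it enumerates the exact output distribution of the protocol with rational arithmetic and checks that it equals $P_{X,Y}$. It says nothing about security, which is where triviality is needed. The function name and data encoding are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction


def protocol_output_distribution(p):
    """Exact joint distribution of (x, y) produced by the two-step protocol."""
    py = defaultdict(Fraction)
    for (_, y), q in p.items():
        py[y] += q
    cond = defaultdict(dict)                      # cond[y] = P_{X|Y=y}
    for (x, y), q in p.items():
        if q > 0:
            cond[y][x] = q / py[y]
    label = {y: frozenset(d.items()) for y, d in cond.items()}   # value of Y dep. on X
    pk = defaultdict(Fraction)                    # distribution of the dependent part
    for y, k in label.items():
        pk[k] += py[y]

    out = defaultdict(Fraction)
    for k, qk in pk.items():                      # step 1: Alice samples and announces k
        for x, qx in k:                           #         then samples x from k = P_{X|Y=y'}
            for y, ky in label.items():           # step 2: Bob samples y given the class k
                if ky == k:
                    out[(x, y)] += qk * qx * (py[y] / qk)
    return dict(out)


half = Fraction(1, 2)
shared_bit = {(0, 0): half, (1, 1): half}         # a trivial primitive
generated = protocol_output_distribution(shared_bit)
```

For the shared-bit primitive the announced class already reveals Bob's output, which is acceptable precisely because Alice's own output reveals it as well.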

In fact, the following lemma (whose proof can be found in the full version)
shows that any non-trivial primitive $P_{X,Y}$ has a non-trivial embedding, i.e. there
exists a quantum protocol correctly implementing $P_{X,Y}$ while leaking less
information to QHBC adversaries than any classical protocol for $P_{X,Y}$ in the HBC
model.

Lemma 3.5. If $P_{X,Y}$ is a non-trivial primitive then the canonical embedding
$|\psi_0\rangle \in \mathcal{E}(P_{X,Y})$ is also non-trivial.

4 The Leakage of Quantum Embeddings 

We formally define the leakage of embeddings and establish properties of the 
leakage. The proofs of all statements in this section can be found in the full
version.


4.1 Definition and Basic Properties of Leakage 

A perfect implementation of $P_{X,Y}$ simply provides $X$ to Alice and $Y$ to Bob and
does nothing else. The expected amount of information that one random variable
gives about the other is $I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) =
I(Y; X)$. Intuitively, we define the leakage of a quantum embedding $|\psi\rangle_{ABA'B'}$
of $P_{X,Y}$ as the larger of the two following quantities: the extra amount of
information Bob’s quantum registers $BB'$ provide about $X$ and the extra amount
Alice’s quantum state in $AA'$ provides about $Y$, respectively, in comparison to
“the minimum amount” $I(X; Y)$.¹⁰

Definition 4.1. Let $|\psi\rangle \in \mathcal{H}_{ABA'B'}$ be an embedding of $P_{X,Y}$. We define the
leakage of $|\psi\rangle$ as

$\Delta_\psi(P_{X,Y}) := \max \{ S(X; BB') - I(X; Y),\ S(AA'; Y) - I(X; Y) \}$.

Furthermore, we say that $|\psi\rangle$ is $\delta$-leaking if $\Delta_\psi(P_{X,Y}) \geq \delta$.

10 There are other natural candidates for the notion of leakage such as the difference in 
difficulty between guessing Alice’s output X by measuring Bob’s final quantum state 
B and based on the output of the ideal functionality Y. While such definitions do 
make sense, they turn out not to be as easy to work with and it is an open question 
whether the natural properties described later in this section can be established for 
these notions of leakage as well.

It is easy to see that the leakage is non-negative since $S(X; BB') \geq S(X; \tilde{B})$ for $\tilde{B}$
the result of a quantum operation applied to $BB'$. Such an operation could be the
trace over the extra working register $B'$ and a measurement in the computational
basis of each qubit of the part encoding $Y$, yielding $S(X; \tilde{B}) = I(X; Y)$.

We want to argue that our notion of leakage is a good measure for the privacy 
of the player’s outputs. In the same spirit, we will argue that the minimum 
achievable leakage for a primitive is related to the “hardness” of implementing 
it. We start off by proving several basic properties about leakage. 

For a general state in $\mathcal{H}_{ABA'B'}$ the quantities $S(X; BB') - I(X; Y)$ and
$S(AA'; Y) - I(X; Y)$ are not necessarily equal. Note though that they coincide
for regular embeddings $|\psi\rangle \in \mathcal{E}(P_{X,Y})$ produced by a correct protocol (where
the work spaces $A'$ and $B'$ are empty): Notice that $S(X; B) = S(X) + S(B) -
S(X, B) = H(X) + S(B) - H(X) = S(B)$ and because $|\psi\rangle$ is pure, $S(A) = S(B)$.
Therefore, $S(X; B) = S(A; Y)$ and the two quantities coincide. The following
lemma states that this actually happens for all embeddings and hence, the
definition of leakage is symmetric with respect to both players.

Lemma 4.2 (Symmetry). Let $|\psi\rangle \in \mathcal{H}_{ABA'B'}$ be an embedding of $P_{X,Y}$. Then,

$\Delta_\psi(P_{X,Y}) = S(X; BB') - I(X; Y) = S(AA'; Y) - I(X; Y)$.
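
For a regular embedding the leakage reduces to $S(B) - I(X; Y)$, which can be evaluated numerically. The sketch below is an illustration restricted to a binary $X$ with values 0 and 1: it uses the fact that $\rho_B$ has the same non-zero eigenvalues as the $2 \times 2$ Gram matrix of Bob's unnormalized conditional states, whose spectrum has a closed form. The function names, the data encoding and the example phase function are assumptions; the test distribution is randomized one-bit Rabin OT.

```python
import cmath
import math


def entropy(values):
    """Shannon entropy in bits of a list of non-negative weights summing to 1."""
    return -sum(v * math.log2(v) for v in values if v > 1e-15)


def mutual_information(p):
    px, py = {}, {}
    for (x, y), q in p.items():
        px[x] = px.get(x, 0.0) + q
        py[y] = py.get(y, 0.0) + q
    return entropy(px.values()) + entropy(py.values()) - entropy(p.values())


def leakage_regular_binary_x(p, theta=lambda x, y: 0.0):
    """Leakage S(B) - I(X;Y) of |psi_theta> when X takes the two values 0 and 1."""
    ys = {y for (_, y) in p}

    def amp(x, y):
        return cmath.exp(1j * theta(x, y)) * math.sqrt(p.get((x, y), 0.0))

    a = sum(abs(amp(0, y)) ** 2 for y in ys)                  # Gram entry <v_0|v_0>
    d = sum(abs(amp(1, y)) ** 2 for y in ys)                  # Gram entry <v_1|v_1>
    b = sum(amp(0, y).conjugate() * amp(1, y) for y in ys)    # Gram entry <v_0|v_1>
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return entropy([mid + rad, mid - rad]) - mutual_information(p)


rot1 = {(0, 0): 0.25, (0, "erased"): 0.25, (1, 1): 0.25, (1, "erased"): 0.25}
leak_canonical = leakage_regular_binary_x(rot1)
leak_phased = leakage_regular_binary_x(
    rot1, theta=lambda x, y: 1.1 * x + (0.4 if y == "erased" else 0.0))
```

Both values come out at about 0.311 bits, since the phases only rotate the off-diagonal Gram entry without changing its modulus.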

The next lemma shows that the leakage of an embedding of a given primitive is 
lower-bounded by the leakage of some regular embedding of the same primitive, 
which simplifies the calculation of lower bounds for the leakage of embeddings. 

Lemma 4.3. For every embedding $|\psi\rangle$ of a primitive $P_{X,Y}$, there is a regular
embedding $|\psi'\rangle$ of $P_{X,Y}$ such that $\Delta_\psi(P_{X,Y}) \geq \Delta_{\psi'}(P_{X,Y})$.

So far, we have defined the leakage of an embedding of a primitive. The natural 
definition of the leakage of a primitive is the following. 

Definition 4.4. We define the leakage of a primitive $P_{X,Y}$ as the minimal leakage
among all protocols correctly implementing $P_{X,Y}$. Formally,

$\Delta_{P_{X,Y}} := \min_{|\psi\rangle} \Delta_\psi(P_{X,Y})$,

where the minimization is over all embeddings $|\psi\rangle$ of $P_{X,Y}$.

Notice that the minimum in the previous definition is well-defined, because by
Lemma 4.3 it is sufficient to minimize over regular embeddings $|\psi\rangle \in \mathcal{E}(P_{X,Y})$.
Furthermore, the function $\Delta_\psi(P_{X,Y})$ is continuous on the compact (i.e. closed
and bounded) set $[0, 2\pi]^{|\mathcal{X} \times \mathcal{Y}|}$ of complex phases corresponding to elements
$|x, y\rangle_{AB}$ in the formula for $|\psi\rangle_{AB} \in \mathcal{E}(P_{X,Y})$ and therefore it achieves
its minimum.

The following theorem shows that the leakage of any embedding of a primitive
$P_{X,Y}$ is lower-bounded by the minimal leakage achievable for primitive
$P_{X \searrow Y, Y \searrow X}$ (which due to Lemma 4.3 is achieved by a regular embedding).

Theorem 4.5. For any primitive $P_{X,Y}$, $\Delta_{P_{X,Y}} \geq \Delta_{P_{X \searrow Y, Y \searrow X}}$.

Proof (Sketch). The proof idea is to pre-process the registers storing $X$ and $Y$
in a way allowing Alice and Bob to convert a regular embedding of $P_{X,Y}$ (for
which the minimum leakage is achieved) into a regular embedding of $P_{X \searrow Y, Y \searrow X}$
by measuring parts of these registers. It follows that on average, the leakage of
the resulting regular embedding of $P_{X \searrow Y, Y \searrow X}$ is at most the leakage of the
embedding of $P_{X,Y}$ the players started with. Hence, there must be a regular
embedding of $P_{X \searrow Y, Y \searrow X}$ leaking at most as much as the best embedding of
$P_{X,Y}$. See the full version for the complete proof. □

4.2 Leakage as Measure of Privacy and Hardness of Implementation 

The main results of this section are consequences of the Holevo bound
(Theorem 2.1).

Theorem 4.6. If a two-party quantum protocol provides the correct outcomes 
of Px,y to the players without leaking extra information, then Px,y must be a 
trivial primitive. 

Proof. Theorem 4.5 implies that if there is a 0-leaking embedding of $P_{X,Y}$ then
there is also a 0-leaking embedding of $P_{X \searrow Y, Y \searrow X}$. Let us therefore assume
that $|\psi\rangle$ is a non-leaking embedding of $P_{X,Y}$ such that $X = X \searrow Y$ and $Y =
Y \searrow X$. We can write $|\psi\rangle$ in the form $|\psi\rangle = \sum_x \sqrt{P_X(x)}\, |x\rangle |\varphi_x\rangle$ and get
$\rho_B = \sum_x P_X(x) |\varphi_x\rangle\langle\varphi_x|$. For the leakage of $|\psi\rangle$ we have: $\Delta_\psi(P_{X,Y}) = S(X; B) -
I(X; Y) = S(\rho_B) - I(X; Y) = 0$. From the Holevo bound (Theorem 2.1) it follows
that the states $\{|\varphi_x\rangle\}_x$ form an orthonormal basis of their span (since $X = X \searrow Y$,
they are all different) and that $Y$ captures the result of a measurement in
this basis, which therefore is the computational basis. Since $Y = Y \searrow X$, we get
that for each $x$, there is a single $y_x \in \mathcal{Y}$ such that $|\varphi_x\rangle = |y_x\rangle$. The primitives
$P_{X \searrow Y, Y \searrow X}$ and $P_{X,Y}$ are therefore trivial. □

In other words, the only primitives that two-party quantum protocols can imple- 
ment correctly (without the help of a trusted third party) and without leakage 
are the trivial ones! We note that it is not necessary to use the strict notion of 
correctness from Definition 3.1 in this theorem, but a more complicated proof
can be done solely based on the correct distribution of the values. This result 
can be seen as a quantum extension of the corresponding characterization for the 
cryptographic power of classical protocols in the HBC model. Whereas classical 
two-party protocols cannot achieve anything non-trivial, their quantum counter- 
parts necessarily leak information when they implement non-trivial primitives. 

The notion of leakage can be extended to protocols involving a trusted third 
party (see |SS|). A special case of such protocols are the ones where the players 
are allowed one call to a black box for a certain non-trivial primitive. It is 
natural to ask which primitives can be implemented without leakage in this case. 
As it turns out, the monotones $H(X \searrow Y \mid Y)$ and $H(Y \searrow X \mid X)$, introduced
in earlier work, are also monotones for quantum computation, in the sense that all joint
random variables $X', Y'$ that can be generated by quantum players without
leakage using one black-box call to $P_{X,Y}$ satisfy $H(X' \searrow Y' \mid Y') \leq H(X \searrow Y \mid Y)$
and $H(Y' \searrow X' \mid X') \leq H(Y \searrow X \mid X)$.

Theorem 4.7. Suppose that primitives $P_{X,Y}$ and $P_{X',Y'}$ satisfy $H(X' \searrow Y' \mid Y') >
H(X \searrow Y \mid Y)$ or $H(Y' \searrow X' \mid X') > H(Y \searrow X \mid X)$. Then any implementation of
$P_{X',Y'}$ using just one call to the ideal functionality for $P_{X,Y}$ leaks information.

4.3 Reducibility of Primitives and Their Leakage 

This section is concerned with the following question: Given two primitives $P_{X,Y}$
and $P_{X',Y'}$ such that $P_{X,Y}$ is reducible to $P_{X',Y'}$, what is the relationship
between the leakage of $P_{X,Y}$ and the leakage of $P_{X',Y'}$? We use the notion of
reducibility in the following sense: We say that a primitive $P_{X,Y}$ is reducible in
the HBC model to a primitive $P_{X',Y'}$ if $P_{X,Y}$ can be securely implemented in
the HBC model from (one call to) a secure implementation of $P_{X',Y'}$. The above
question can also be generalized to the case where $P_{X,Y}$ can be computed from
$P_{X',Y'}$ only with certain probability. Notice that the answer, even if we assume
perfect reducibility, is not captured in our previous result from Lemma 4.3 since
an embedding of $P_{X',Y'}$ is not necessarily an embedding of $P_{X,Y}$ (it might
violate the correctness condition). However, under certain circumstances, we can
show that $\Delta_{P_{X',Y'}} \geq \Delta_{P_{X,Y}}$.

Theorem 4.8. Assume that primitives $P_{X,Y}$ and $P_{X',Y'} = P_{X'_0 X'_1, Y'_0 Y'_1}$ satisfy
the condition:

$\sum_{x,y \,:\, P_{X'_0, Y'_0 \mid X'_1 = x, Y'_1 = y} \simeq P_{X,Y}} P_{X'_1, Y'_1}(x, y) \geq 1 - \delta$,

where the relation $\simeq$ means that the two distributions are equal up to relabeling
of the alphabet. Then, $\Delta_{P_{X',Y'}} \geq (1 - \delta)\, \Delta_{P_{X,Y}}$.

This theorem allows us to derive a lower bound on the leakage of 1-out-of-2
Oblivious Transfer of $r$-bit strings in Section 5.

5 The Leakage of Universal Cryptographic Primitives 

In this section, we exhibit lower bounds on the leakage of some universal
two-party primitives. In the following table, ROT$^r$ denotes the $r$-bit string version
of randomized Rabin OT, where Alice receives a random $r$-bit string and Bob
receives the same string or an erasure symbol, each with probability 1/2.
Similarly, 1-2-OT$^r$ denotes the string version of 1-2-OT, where Alice receives two
$r$-bit strings and Bob receives one of them. By 1-2-OT$_p$ we denote the noisy
version of 1-2-OT, where the 1-2-OT functionality is implemented correctly only
with probability $1 - p$. Table 1 summarizes the lower bounds on the leakage of
these primitives (the derivations can be found in the full version). We note
that Wolf and Wullschleger have shown that a randomized 1-2-OT can be
transformed by local operations into an additive sharing of an AND (here called
sAND). Therefore, our results for 1-2-OT below also apply to sAND.

Table 1. Lower bounds on the leakage for universal two-party primitives

primitive       leaking at least                  comments
ROT$^1$         $h(1/4) - 1/2 \approx 0.311$      same leakage for all regular embeddings
ROT$^r$         $1 - O(r 2^{-r})$                 same leakage for all regular embeddings
1-2-OT, sAND                                      minimized by canonical embedding
1-2-OT$^r$      $1 - O(r 2^{-r})$                 (suboptimal) lower bound
1-2-OT$_p$                                        if $p < \sin^2(\pi/8) \approx 0.15$, (suboptimal) lower bound

1-2-OT$^r$ and 1-2-OT$_p$ are primitives where the direct evaluation of the leakage
for a general embedding $|\psi_\theta\rangle$ is hard, because the number of possible phases
increases exponentially in the number of qubits. Instead of computing $S(A)$
directly, we derive (suboptimal) lower bounds on the leakage.

Based on the examples of ROT$^r$ and 1-2-OT, it is tempting to conjecture that
the leakage is always minimized for the canonical embedding, which agrees with
the geometric intuition that the minimal pairwise distinguishability of quantum
states in a mixture minimizes the von Neumann entropy of the mixture. However,
Jozsa and Schlienz have shown that this intuition is sometimes incorrect:
in a quantum system of dimension at least three, we can have the following
situation: For two sets of pure states $\{|u_i\rangle\}_{i=1}^n$ and $\{|v_i\rangle\}_{i=1}^n$ satisfying $|\langle u_i|u_j\rangle| \leq
|\langle v_i|v_j\rangle|$ for all $i, j$, there exist probabilities $p_i$ such that for $\rho_u := \sum_{i=1}^n p_i |u_i\rangle\langle u_i|$,
$\rho_v := \sum_{i=1}^n p_i |v_i\rangle\langle v_i|$, it holds that $S(\rho_u) < S(\rho_v)$. As we can see, although each
pair $|u_i\rangle, |u_j\rangle$ is more distinguishable than the corresponding pair $|v_i\rangle, |v_j\rangle$, the
overall $\rho_u$ provides us with less uncertainty than $\rho_v$. It follows that although
for the canonical embedding $|\psi_0\rangle = \sum_y \sqrt{P_Y(y)}\, |\varphi_y\rangle |y\rangle$ of $P_{X,Y}$ the mutual overlaps
$|\langle\varphi_y|\varphi_{y'}\rangle|$ are clearly maximized, it does not necessarily imply that $S(A)$ in
this case is minimal over $\mathcal{E}(P_{X,Y})$. It is an interesting open question to find a
primitive whose canonical embedding does not minimize the leakage or to prove
that no such primitive exists.

For the primitive 1-2-OT$_p$, our lower bound on the leakage only holds for $p <
\sin^2(\pi/8) \approx 0.15$. Notice that in reality, the leakage is strictly positive for any
embedding of 1-2-OT$_p$ with $p < 1/4$, since for $p < 1/4$, 1-2-OT$_p$ is a non-trivial
primitive. On the other hand, 1-2-OT$_{1/4}$ is a trivial primitive implemented securely
by the following protocol in the classical HBC model:

1. Alice chooses randomly between her input bits $x_0$ and $x_1$ and sends the
chosen value $x_a$ to Bob.

2. Bob chooses his selection bit $c$ uniformly at random and sets $y := x_a$.

Equality $x_c = y$ is satisfied if either $a = c$, which happens with probability
1/2, or if $a \neq c$ and $x_a = x_{1-a}$, which happens with probability 1/4. Since the
two events are disjoint, it follows that $x_c = y$ with probability 3/4 and that
the protocol implements 1-2-OT$_{1/4}$. The implementation is clearly secure against
honest-but-curious Alice, since she does not receive any message from Bob. It
is also secure against Bob, since he receives only one bit from Alice. By letting
Alice randomize the value of the bit she is sending, the players can implement
1-2-OT$_p$ securely for any value $1/4 < p < 1/2$.
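
The probability computation for this classical protocol can be confirmed by exhaustive enumeration. The sketch below assumes, as in the randomized setting used throughout, that Alice's bits $x_0, x_1$, her choice $a$ and Bob's selection bit $c$ are independent and uniform; the function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def success_probability():
    """Exact probability that Bob's output y equals x_c in the trivial protocol."""
    total = Fraction(0)
    # x0, x1: Alice's bits; a: index of the bit she sends; c: Bob's selection bit.
    for x0, x1, a, c in product((0, 1), repeat=4):
        y = (x0, x1)[a]                 # Bob simply outputs the bit he received
        if (x0, x1)[c] == y:
            total += Fraction(1, 16)    # each of the 16 combinations is equally likely
    return total


p_correct = success_probability()
```

The enumeration gives 3/4, matching the two disjoint events counted in the text (probability 1/2 for $a = c$, plus 1/4 for $a \neq c$ with equal bits).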

6 Conclusion and Open Problems 

We have provided a quantitative extension of qualitative impossibility results 
for two-party quantum cryptography. All non-trivial primitives leak information 
when implemented by quantum protocols. Notice that demanding a protocol to 
be non-leaking does in general not imply the privacy of the players’ outputs. 
For instance, consider a protocol implementing 1-2-OT but allowing a curious
receiver with probability 1/2 to learn both bits simultaneously or with probability
1/2 to learn nothing about them. Such a protocol for 1-2-OT would be non-leaking
but nevertheless insecure. Consequently, Theorem 4.6 not only tells us that any
quantum protocol implementing a non-trivial primitive must be insecure, but
also that a privacy breach will reveal itself as leakage. Our framework allows us to
quantify the leakage of any two-party quantum protocol correctly implementing
a primitive. The impossibility results obtained here are stronger than standard 
ones since they only rely on the cryptographic correctness of the protocol. Fur- 
thermore, we present lower bounds on the leakage of some universal two-party 
primitives. 

A natural open question is to find a way to identify good embeddings for a 
given primitive. In particular, how far can the leakage of the canonical embedding 
be from the best one? Such a characterization, even if only applicable to special 
primitives, would allow us to lower-bound their leakage and would also help to
understand the power of two-party quantum cryptography in a more concise way. 

It would also be interesting to find a measure of cryptographic non-triviality 
for two-party primitives and to see how it relates to the minimum leakage of any 
implementation by quantum protocols. For instance, is it true that quantum 
protocols for primitive Px,y leak more if the minimum (total variation) distance 
between Px,y and any trivial primitive increases? 

Another question we leave for future research is to define and investigate other 
notions of leakage, e.g. in the one-shot setting instead of in the asymptotic regime 
(as outlined in Footnote 10). Results in the one-shot setting have already been
established for data compression EDI, channel capacities EH , state-merging E3H 
and other (quantum-) information-theoretic tasks. 

Furthermore, it would be interesting to find more applications for the concept 
of leakage, considered also for protocols using an environment as a trusted third 
party. In this direction, we have shown in Theorem 4.7 that any two-party
quantum protocol for a given primitive, using a black box for an “easier” primitive,
leaks information. Lower-bounding this leakage is an interesting open question.
We might also ask how many copies of the “easier” primitive are needed to
implement the “harder” primitive by a quantum protocol, which would give us
an alternative measure of non-triviality of two-party primitives.

Abstract. Code-based cryptography is often viewed as an interesting
“Post-Quantum” alternative to the classical number theory cryptogra- 
phy. Unlike many other such alternatives, it has the convenient advan- 
tage of having only a few, well identified, attack algorithms. However, 
improvements to these algorithms have made their effective complexity 
quite complex to compute. We give here some lower bounds on the work 
factor of idealized versions of these algorithms, taking into account all 
possible tweaks which could improve their practical complexity. The aim 
of this article is to help designers select durably secure parameters. 

Keywords: computational syndrome decoding, information set decod- 
ing, generalized birthday algorithm. 


Introduction 

Code-based cryptography has received renewed attention with the recent interest
in “Post-Quantum Cryptography”. Several new interesting
proposals have been published in the last few months. For those new
constructions as well as for previously known code-based cryptosystems, precise
parameter selection is always a sensitive issue. Most of the time the most
threatening attacks are based on decoding algorithms for generic linear codes. There
are two main families of algorithms, Information Set Decoding (ISD) and
Generalized Birthday Algorithm (GBA), each suited to a different range of
parameters.

ISD is part of the folklore of algorithmic coding theory and is among the most
efficient techniques for decoding errors in an arbitrary linear code. One major
step in the development of ISD for the cryptanalysis of the McEliece encryption
scheme is Stern’s variant, which mixes a birthday attack with the traditional
approach. A first implementation description, with several improvements,
led to an attack of $2^{64.2}$ binary operations for the original McEliece parameters,
that is decoding 50 errors in a code of length 1024 and dimension 524. More
recently, a new implementation was proposed with several new improvements,
with a binary work factor of $2^{60.5}$. Furthermore, the authors report a real attack
(with the original parameters) with a computational effort of about $2^{58}$ CPU
cycles. The above numbers are accurate estimates of the real cost of a decoding
attack. They involve several parameters that have to be optimized and
furthermore, no closed formula exists, making a precise evaluation rather difficult.

GBA was introduced by Wagner in 2002 but was not specifically designed
for decoding. Less generic versions of this algorithm had already been used in
the past for various cryptanalytic applications. Its first successful use to
cryptanalyse a code-based system is due to Coron and Joux. In particular,
this work had a significant impact on the selection of the parameters of the FSB hash
function.

Most previous papers on decoding attacks were written from the point of view
of the attacker and were looking for upper bounds on the work factor of some
specific implementation. One exception is the asymptotic analysis for ISD that
has been recently presented. Here we propose a designer approach and we
aim at providing tools to easily select secure parameters.

For both families, we present new idealized versions of the algorithms, which
encompass all variants and improvements known in cryptology as well as some
new optimizations. This allows us to give easy-to-compute lower bounds for
decoding attacks up to the state of the art.

We successively study three families of algorithms, first the “standard” birth- 
day attack, then two evolutions of this technique, namely Stern’s variant of infor- 
mation set decoding and Wagner’s generalized birthday algorithm. In each case 
we propose very generic lower bounds on their complexity. Finally, we illustrate 
our work with case studies of some of the main code-based cryptosystems. 

1 The Decoding Problem in Cryptology 

Problem 1 (Computational Syndrome Decoding - CSD). Given a matrix
$H \in \{0,1\}^{r \times n}$, a word $s \in \{0,1\}^r$ and an integer $w > 0$, find a word $e \in \{0,1\}^n$
of Hamming weight $\leq w$ such that $eH^T = s$.

We will denote by CSD$(H, s, w)$ an instance of that problem. It is equivalent to
decoding $w$ errors in a code with parity check matrix $H$. The decision problem
associated with computational syndrome decoding, namely, Syndrome Decoding,
is NP-complete.
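
To make the problem statement concrete, here is a minimal brute-force solver for tiny CSD instances. It is only an illustration of the definition (exhaustive search over supports of weight at most $w$), not one of the attacks studied in this paper; the function names and the row-list encoding of $H$ are assumptions.

```python
from itertools import combinations


def syndrome(H, e):
    """Compute e * H^T over GF(2); H is a list of r rows of n bits each."""
    return tuple(sum(h * x for h, x in zip(row, e)) % 2 for row in H)


def solve_csd(H, s, w):
    """Return some e of Hamming weight <= w with e * H^T = s, or None."""
    n = len(H[0])
    for weight in range(w + 1):
        for support in combinations(range(n), weight):
            e = [0] * n
            for i in support:
                e[i] = 1
            if syndrome(H, e) == tuple(s):
                return e
    return None


# Parity check matrix of the [7,4] Hamming code: column j is j+1 in binary.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
e = solve_csd(H, (1, 1, 0), 1)
```

The search cost grows like $\binom{n}{w}$, which is the quantity the birthday and ISD techniques below aim to reduce.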

This problem appears in code-based cryptography and for most systems it is
the most threatening known attack (sometimes the security can be reduced to
CSD alone). Throughout the paper we will denote

$W_{n,w} = \{e \in \{0,1\}^n \mid \mathrm{wt}(e) = w\}$

the set of all binary words of length $n$ and Hamming weight $w$. The instances
of CSD coming from cryptology usually have solutions. Most of the time, this
solution is unique. This is the case for public-key encryption schemes or
for identification schemes. However, if the number $w$ of errors is larger
than the Gilbert-Varshamov distance¹ we may have a few, or even a large
number, of solutions. Obtaining one of them is enough. This is the case for digital
signatures or for hashing.

1 The Gilbert-Varshamov distance is the smallest integer $d_0$ such that $\binom{n}{d_0} \geq 2^r$.

2 The Birthday Attack for Decoding

We consider an instance CSD$(H, s, w)$ of the computational syndrome decoding.
If the weight $w$ is even, we partition the columns of $H$ in two subsets (a priori
of equal size). For instance, let $H = (H_1 \mid H_2)$ and let us consider the sets
$\mathcal{L}_1 = \{e_1 H_1^T \mid e_1 \in W_{n/2,w/2}\}$ and $\mathcal{L}_2 = \{s + e_2 H_2^T \mid e_2 \in W_{n/2,w/2}\}$. Any
element of $\mathcal{L}_1 \cap \mathcal{L}_2$ provides a pair $(e_1, e_2)$ such that $e_1 H_1^T = s + e_2 H_2^T$ and
$e_1 + e_2$ is a solution to CSD$(H, s, w)$. This collision search has to be repeated
$1/\mathrm{Pr}_{n,w}$ times on average where $\mathrm{Pr}_{n,w}$ is the probability that one of the solutions
splits evenly between the left and right parts of $H$. Let $C_{n,r,w}$ denote the total
number of column sums we have to compute. If the solution is unique, we
have²

$\mathrm{Pr}_{n,w} = \frac{\binom{n/2}{w/2}^2}{\binom{n}{w}}$ and $C_{n,r,w} = \frac{|\mathcal{L}_1| + |\mathcal{L}_2|}{\mathrm{Pr}_{n,w}} = \frac{2\binom{n}{w}}{\binom{n/2}{w/2}} \approx \sqrt[4]{\pi w/2}\; 2\sqrt{\binom{n}{w}}$.

This number is close to the improvement expected when the birthday paradox
can be applied (i.e. replacing an enumeration of $N$ elements by an enumeration
of $2\sqrt{N}$ elements). In this section, we will show that the factor $\sqrt[4]{\pi w/2}$ can be
removed and that the formula often applies when $w$ is odd. We will also provide
cost estimations and bounds.
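
As a quick numerical check of these formulas, the sketch below evaluates the even-split probability and the number of column sums for the original McEliece parameters quoted in the introduction ($n = 1024$, $w = 50$), working in $\log_2$ to keep the numbers readable. The function names are illustrative assumptions.

```python
from math import comb, log2, pi


def even_split_probability(n, w):
    """Pr_{n,w}: probability that a fixed weight-w word splits evenly (w even)."""
    return comb(n // 2, w // 2) ** 2 / comb(n, w)


def log2_column_sums(n, w):
    """log2 of C = 2 * binom(n, w) / binom(n/2, w/2)."""
    return 1 + log2(comb(n, w)) - log2(comb(n // 2, w // 2))


def log2_column_sums_approx(n, w):
    """log2 of the approximation (pi*w/2)^(1/4) * 2 * sqrt(binom(n, w))."""
    return 0.25 * log2(pi * w / 2) + 1 + 0.5 * log2(comb(n, w))


n, w = 1024, 50
pr = even_split_probability(n, w)
exact = log2_column_sums(n, w)
approx = log2_column_sums_approx(n, w)
```

For these parameters the exact value and the Stirling-based approximation agree to within a small fraction of a bit, consistent with the footnote's condition $w \ll n$.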

2.1 A Decoding Algorithm Using the Birthday Paradox 

The algorithm presented in Table 1 generalizes the birthday attack for decoding
presented above. For any fixed values of $n$, $r$ and $w$ this algorithm uses three
parameters (to be optimized): an integer $\ell$ and two sets of constant weight words
$W_1$ and $W_2$.

The idea is to operate as much as possible with partial syndromes of size $\ell < r$
and to make the full comparison on $r$ bits only when we have a partial match.
Increasing the size of $W_1$ (and $W_2$) will lead to a better trade-off, ideally with a
single execution of (main loop).

Definition 1. For any fixed value of $n$, $r$ and $w$, we denote by $WF_{BA}(n, r, w)$ the
minimal binary work factor (average cost in binary operations) of the algorithm
of Table 1 to produce a solution to CSD, for any choices of parameters $W_1$, $W_2$
and $\ell$.

An Estimation of the Cost. We will use the following assumptions (discussed 
in appendix): 

(B1) For all pairs $(e_1, e_2)$ examined in the algorithm, the sums $e_1 + e_2$ are
uniformly and independently distributed in $W_{n,w}$.


2 We use Stirling’s formula to approximate factorials. The approximation we give is
valid because $w \ll n$.

Table 1. Birthday decoding algorithm

For any fixed values of $n$, $r$ and $w$, the following algorithm uses three parameters:
an integer $\ell > 0$, $W_1 \subset W_{n,\lfloor w/2 \rfloor}$ and $W_2 \subset W_{n,\lceil w/2 \rceil}$. We denote by
$h_\ell(x)$ the first $\ell$ bits of any $x \in \{0,1\}^r$.

procedure BirthdayDecoding
input: $H_0 \in \{0,1\}^{r \times n}$, $s \in \{0,1\}^r$
repeat                                               (main loop)
    P <- random n x n permutation matrix
    H <- H_0 P
    for all e in W_1
        i <- h_l(e H^T)                              (ba 1)
        write(e, i)    // store e in some data structure at index i
    for all e_2 in W_2
        i <- h_l(s + e_2 H^T)                        (ba 2)
        S <- read(i)   // extract the elements stored at index i
        for all e_1 in S
            if e_1 H^T = s + e_2 H^T                 (ba 3)
                return (e_1 + e_2) P^T               (success)

(B2) The cost of the execution of the algorithm is approximately equal to

$\ell \cdot \sharp(\text{ba 1}) + \ell \cdot \sharp(\text{ba 2}) + K_0 \cdot \sharp(\text{ba 3})$, (1)

where $K_0$ is the cost for testing $e_1 H^T = s + e_2 H^T$ given that $h_\ell(e_1 H^T) =
h_\ell(s + e_2 H^T)$ and $\sharp(\text{ba } i)$ is the expected number of executions of the
instruction (ba i) before we meet the (success) condition.

Proposition 1. Under assumptions (B1) and (B2), we have³

$WF_{BA}(n, r, w) \approx 2L \log(K_0 L)$ with $L = \min\left(\sqrt{\binom{n}{w}},\ 2^{r/2}\right)$

and $K_0$ is the cost for executing the instruction (ba 3) (i.e. testing $eH^T = s$).


Remarks 

1. When $\binom{n}{w} > 2^r$, the cost will depend on the number of syndromes $2^r$ instead
of the number of words of weight $w$. This corresponds to the case where $w$ is
larger than the Gilbert-Varshamov distance and we have multiple solutions.
We only need one of those solutions and thus the size of the search space is
reduced.


3 Here and after, “log” denotes the base 2 logarithm (and “ln” the Neperian
logarithm).

2. It is interesting to note the relatively low impact of $K_0$, the cost of the
test in (BA 3). Between an extremely conservative lower bound of $K_0 = 2$,
an extremely conservative upper bound of $K_0 = wr$ and a more realistic
$K_0 = 2w$, the differences are very small.

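3. In the case where $w$ is odd and $\binom{n}{\lfloor w/2 \rfloor} < L$, the formula of Proposition 1 is
only a lower bound. A better estimate would be

$WF_{BA}(n, r, w) \approx 2L' \log\left(K_0 \frac{L^2}{L'}\right)$ with $L' = \frac{\binom{n}{\lfloor w/2 \rfloor}^2 + L^2}{2\binom{n}{\lfloor w/2 \rfloor}}$. (2)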

4. Increasing the size of $W_1$ (and $W_2$) can be easily and efficiently achieved
by “overlapping” $H_1$ and $H_2$ (see the introduction of this section). More
precisely, we take for $W_1$ all words of weight $w/2$ using only the $n'$ first
coordinates (with $n/2 < n' < n$). Similarly, $W_2$ will use the $n'$ (or more) last
coordinates.
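The procedure of Table 1 can be tried on a toy instance. The following is a minimal sketch, not the authors' implementation: it assumes that columns of $H$ and error words are encoded as Python integers, it takes $W_1$ and $W_2$ as large as possible (so a single pass suffices and no permutation is drawn), and it uses the low-order bits for $h_\ell$. The names `syndrome`, `words` and `birthday_decode` are illustrative.

```python
from itertools import combinations


def syndrome(e, cols):
    """Return e * H^T as an int; bit j of e selects column cols[j] of H."""
    s, j = 0, 0
    while e:
        if e & 1:
            s ^= cols[j]
        e >>= 1
        j += 1
    return s


def words(n, t):
    """Yield all n-bit words of Hamming weight t, encoded as ints."""
    for support in combinations(range(n), t):
        yield sum(1 << j for j in support)


def birthday_decode(cols, s, w, ell):
    """Toy birthday decoding with W1 and W2 as large as possible.

    cols[j] is column j of H encoded as an int. h_ell is taken here as the
    ell low-order bits (an arbitrary choice for this sketch).
    Returns a word e of weight w with e * H^T = s, or None.
    """
    n = len(cols)
    mask = (1 << ell) - 1
    table = {}
    for e1 in words(n, w // 2):                      # (BA 1): index W1
        table.setdefault(syndrome(e1, cols) & mask, []).append(e1)
    for e2 in words(n, w - w // 2):                  # (BA 2): probe with W2
        target = s ^ syndrome(e2, cols)
        for e1 in table.get(target & mask, []):
            # (BA 3): full comparison; disjoint supports ensure weight w
            if syndrome(e1, cols) == target and e1 & e2 == 0:
                return e1 | e2
    return None
```

With a planted error of weight $w$, the decoder must return some word of weight $w$ with the right syndrome, since the planted word itself splits into one element of $W_1$ and one of $W_2$.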

2.2 Lower Bounds 

As the attacker can make a clever choice of $W_1$ and $W_2$ which may contradict
assumption (B1), we do not want to use it for the lower bound. The result remains
very close to the estimate of the previous sections except for the multiplicative
constant which is $\sqrt{2}$ instead of 2.

Theorem 1. For any fixed values of $n$, $r$ and $w$, we have

$WF_{BA}(n, r, w) \ge \sqrt{2}\, L \log(K_0 L)$ with $L = \min\left(\sqrt{\binom{n}{w}},\, 2^{r/2}\right)$

where $K_0$ is the cost for executing the instruction (BA 3).
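The estimate of Proposition 1 and the bound of Theorem 1 differ only by their leading constant, which is easy to check numerically. The sketch below works in the log domain to avoid huge numbers; the function names are hypothetical and the value of $K_0$ is left to the caller.

```python
from math import comb, log2, sqrt


def log2_L(n, r, w):
    """log2 of L = min(sqrt(binom(n, w)), 2^(r/2))."""
    return min(0.5 * log2(comb(n, w)), r / 2)


def log2_wf_birthday(n, r, w, k0, constant=2.0):
    """log2 of constant * L * log2(K0 * L) (Proposition 1 when constant = 2)."""
    l2 = log2_L(n, r, w)
    return log2(constant) + l2 + log2(log2(k0) + l2)


def log2_wf_birthday_lower(n, r, w, k0):
    """log2 of sqrt(2) * L * log2(K0 * L) (Theorem 1)."""
    return log2_wf_birthday(n, r, w, k0, constant=sqrt(2))
```

For any parameter set the two values are exactly half a bit apart, which is the sense in which the lower bound "remains very close" to the estimate.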

3 Information Set Decoding (ISD) 

We will consider here Stern’s algorithm [22], which is the best known decoder
for cryptographic purposes, and some of its implemented variants by Canteaut-
Chabaud [10] and Bernstein-Lange-Peters [6]. Our purpose is to present a lower
bound which takes all known improvements into account.

3.1 A New Variant of Stern’s Algorithm 

Following other works [15, 16], J. Stern describes in [22] an algorithm to find
a word of weight $w$ in a binary linear code of length $n$ and dimension $k$ (and
codimension $r = n - k$). The algorithm uses two additional parameters $p$ and
$\ell$ (both positive integers). We present here a generalized version which acts on
the parity check matrix $H_0$ of the code (instead of the generator matrix). Table 2
describes the algorithm.

Table 2. Generalized ISD algorithm

For any fixed values of $n$, $r$ and $w$, the following algorithm uses four
parameters: two integers $p > 0$ and $\ell > 0$ and two sets $W_1 \subset W_{k+\ell,\lfloor p/2 \rfloor}$ and
$W_2 \subset W_{k+\ell,\lceil p/2 \rceil}$. We denote by $h_\ell(x)$ the last $\ell$ bits of any $x \in \{0,1\}^r$.

procedure ISDecoding
input: H_0 ∈ {0,1}^(r×n), s_0 ∈ {0,1}^r
repeat                                              (MAIN LOOP)
    P ← random n × n permutation matrix
    (H', U) ← PGElim(H_0 P)    // partial elimination as in (3)
    s ← s_0 U^T
    for all e ∈ W_1
        i ← h_ℓ(e H'^T)                             (ISD 1)
        write(e, i)    // store e in some data structure at index i
    for all e_2 ∈ W_2
        i ← h_ℓ(s + e_2 H'^T)                       (ISD 2)
        S ← read(i)    // extract the elements stored at index i
        for all e_1 ∈ S
            if wt(s + (e_1 + e_2) H'^T) = w − p     (ISD 3)
                return (P, e_1 + e_2)               (SUCCESS)

The partial Gaussian elimination of $H_0 P$ consists in finding $U$ ($r \times r$ and
non-singular) and $H$ (and $H'$) such that$^4$

$$U H_0 P = H = \left( \begin{array}{c|c} \begin{matrix} I_{r-\ell} \\ 0 \end{matrix} & H' \end{array} \right) \qquad (3)$$

where $U$ is a non-singular $r \times r$ matrix, the left block has $r - \ell$ columns and
$H'$ has $k + \ell$ columns. Let $s = s_0 U^T$. If $e$ is a solution of
CSD($H$, $s$, $w$) then $eP^T$ is a solution of CSD($H_0$, $s_0$, $w$). Let $(P, e')$ be the output
of the algorithm, i.e., $wt(s + e'H'^T) = w - p$, and let $e''$ be the first $r - \ell$ bits of
$s + e'H'^T$; the word $e = (e'' \mid e')$ is a solution of CSD($H$, $s$, $w$).


Definition 2. For any fixed values of $n$, $r$ and $w$, we denote by $WF_{ISD}(n, r, w)$ the
minimal binary work factor (average cost in binary operations) of the algorithm of
Table 2 to produce a solution to CSD, for any choices of parameters $\ell$, $p$, $W_1$ and $W_2$.


3.2 Estimation of the Cost of the New Variant 

To evaluate the cost of the algorithm we will assume that only the instructions 
(ISD i) are significant. This assumption is stronger than for the birthday attack, 

4 In the very unlikely event that the first $r - \ell$ columns are linearly dependent, we can
change $P$.




because it means that the Gaussian elimination at the beginning of every (MAIN
LOOP) costs nothing. It is a valid assumption as we only want a lower bound.
Moreover, most of the improvements introduced in [6, 10] are meant to reduce
the relative cost of the Gaussian elimination. We claim that within this “free
Gaussian elimination” assumption any lower bound on the algorithm of Table 2
will apply to all of those variants. Our estimations will use the following
assumptions:

(I1) For all pairs $(e_1, e_2)$ examined in the algorithm, the sums $e_1 + e_2$ are
uniformly and independently distributed in $W_{k+\ell,p}$.

(I2) The cost of the execution of the algorithm is approximately equal to

$\ell \cdot \sharp(\text{ISD 1}) + \ell \cdot \sharp(\text{ISD 2}) + K_{w-p} \cdot \sharp(\text{ISD 3})$, (4)

where $K_{w-p}$ is the average cost for checking $wt(s + (e_1 + e_2)H'^T) = w - p$
and $\sharp(\text{ISD } i)$ is the expected number of executions of the instruction (ISD
i) before we meet the (SUCCESS) condition.

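Proposition 2. Under assumptions (I1) and (I2), if $\binom{n}{w} < 2^r$ (single solution)
or if $\binom{n}{w} > 2^r$ (multiple solutions) and $\binom{r}{w-p}\binom{k}{p} \ll 2^r$, we have (we recall that
$k = n - r$)

$$WF_{ISD}(n, r, w) \approx \min_p \frac{2\ell \min\left(\binom{n}{w}, 2^r\right)}{\lambda \binom{r-\ell}{w-p} \sqrt{\binom{k+\ell}{p}}} \quad \text{with } \ell = \log\left(K_{w-p}\sqrt{\binom{k}{p}}\right)$$

with $\lambda = 1 - e^{-1} \approx 0.63$. If $\binom{n}{w} > 2^r$ (multiple solutions) and $\binom{r}{w-p}\binom{k}{p} > 2^r$,
we have

$$WF_{ISD}(n, r, w) \approx \min_p \frac{2\ell\, 2^{r/2}}{\sqrt{\binom{r-\ell}{w-p}}} \quad \text{with } \ell = \log\left(K_{w-p}\frac{2^{r/2}}{\sqrt{\binom{r}{w-p}}}\right).$$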

Remarks 

1. For a given set of parameters the expected number of executions of (MAIN
LOOP) is $N = 1/(1 - \exp(-X))$ where $X = \binom{r-\ell}{w-p}\binom{k+\ell}{p} / \min\left(2^r, \binom{n}{w}\right)$.

2. The second formula applies when $X > 1$, that is when the expected number
of executions of (MAIN LOOP) is (not much more than) one. In that case, as
for the birthday attack, the best strategy is to use $W_2 = W_{k+\ell,\lceil p/2 \rceil}$ (i.e. as
large as possible) and to take $W_1$ as small as possible but large enough to have
only one execution of (MAIN LOOP) with probability close to 1.

3. When $X \ll 1$, we have $N = 1/(1 - \exp(-X)) \approx 1/X$ and the first formula
applies.

4. When $X < 1$, the first formula still gives a good lower bound. But it is less
tight when $X$ gets closer to 1.

5. When $p$ is small and odd the above estimates for $WF_{ISD}$ are not always
accurate. The adjustment is similar to what we have in (2) (see the remarks
following the birthday decoder estimation). In practice, if
$\binom{k+\ell}{\lfloor p/2 \rfloor} < \sqrt{\binom{k+\ell}{p}}$ it is probably advisable to discard this odd value of $p$.

6. We use the expression $\ell = \log(K_{w-p} L_p(0))$ for the optimal value of $\ell$ (where
$L_p(\ell) = \sqrt{\binom{k+\ell}{p}}$ or $L_p(\ell) = 2^{r/2} / \sqrt{\binom{r-\ell}{w-p}}$ respectively in the first case or in
the second case of the Proposition). In fact a better value would be a fixpoint
of the mapping $\ell \mapsto \log(K_{w-p} L_p(\ell))$. In practice $L_p(0)$ is a very good approximation.
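The first formula of Proposition 2 can be evaluated directly for a fixed $p$. The sketch below is only an illustration under stated assumptions: it takes $K_{w-p} = 2(w - p)$ as a default (a hypothetical choice, since the text does not fix this constant here), rounds $\ell = \log(K_{w-p} L_p(0))$ to an integer, and ignores the odd-$p$ adjustment of Remark 5. The function names are illustrative.

```python
from math import comb, exp, log2


def log2_wf_isd(n, r, w, p, k_cost=None):
    """log2 of the first formula of Proposition 2 for a fixed p.

    k_cost stands for K_{w-p}; the default 2 * (w - p) is an assumption
    of this sketch, not a value taken from the text.
    """
    k = n - r
    if p > k:
        return float("inf")
    if k_cost is None:
        k_cost = 2 * (w - p)
    # ell = log2(K_{w-p} * sqrt(binom(k, p))), rounded to an integer
    ell = max(1, round(log2(k_cost) + 0.5 * log2(comb(k, p))))
    if r - ell < w - p:
        return float("inf")
    lam = 1 - exp(-1)
    space = min(log2(comb(n, w)), r)   # log2 of min(binom(n, w), 2^r)
    return (log2(2 * ell) + space - log2(lam)
            - log2(comb(r - ell, w - p))
            - 0.5 * log2(comb(k + ell, p)))


def best_p(n, r, w, p_max=20):
    """Value of p in 1..p_max minimizing the estimate above."""
    candidates = range(1, min(p_max, w - 1) + 1)
    return min(candidates, key=lambda p: log2_wf_isd(n, r, w, p))
```

For the classical parameters $n = 1024$, $r = 500$, $w = 50$ and $p = 4$, this evaluation should land in the vicinity of the $2^{59.9}$ entry of Table 3, up to the choice of $K_{w-p}$ and the rounding of $\ell$.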


3.3 Gain Compared with Stern’s Algorithm 

Stern’s algorithm corresponds to a complete Gaussian elimination and to a
particular choice of $W_1$ and $W_2$ in the algorithm of Table 2. A full Gaussian
elimination is applied to the permuted matrix $H_0 P$ and we get $U$ and $H'$ such that:

$$U H_0 P = H = \left( I_r \mid H' \right),$$

where $I_r$ is the $r \times r$ identity and $H'$ has $k$ columns.

The $\ell$-bit collision search is performed on $k$ columns; moreover $p$ is always even
and $W_1$ and $W_2$ will use $p/2$ columns of $H_1$ and $H_2$. The variants presented
in [6, 10] consist in reducing the cost of the Gaussian elimination, or, for the
same $H'$, in using different “slices” $(H_1 \mid H_2)$ of $\ell$ rows. All other improvements
lead to an operation count which is close to what we have in (4). The following
formula, obtained with the techniques of the previous section, gives a tight lower
bound for all those variants.

$$WF_{Stern}(n, r, w) \approx \min_p \frac{2\ell \min\left(\binom{n}{w}, 2^r\right)}{\binom{r-\ell}{w-p}\binom{k/2}{p/2}} \quad \text{with } \ell = \log\left(K_{w-p}\binom{k/2}{p/2}\right).$$

The gain of the new version of ISD is $\approx \lambda\sqrt[4]{\pi p/2}$, which is rather small in
practice and corresponds to the improvement of the “birthday paradox” part of
the algorithm.
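The fourth-root factor in this gain comes from comparing $\sqrt{\binom{k}{p}}$ with $\binom{k/2}{p/2}$. A quick numerical check, ignoring $\ell$ and assuming even $k$ and $p$ (the function names are illustrative):

```python
from math import comb, exp, pi, sqrt


def birthday_gain_exact(k, p):
    """sqrt(binom(k, p)) / binom(k/2, p/2) for even k and p (ell ignored)."""
    return sqrt(comb(k, p)) / comb(k // 2, p // 2)


def birthday_gain_approx(p):
    """Asymptotic value (pi * p / 2)^(1/4) of the ratio above."""
    return (pi * p / 2) ** 0.25


LAMBDA = 1 - exp(-1)   # the constant lambda of Proposition 2
```

For $k = 524$ and $p = 4$ the exact ratio and the approximation agree to within a few percent, and multiplying by $\lambda \approx 0.63$ gives a gain close to 1, consistent with the remark that the gain is small in practice.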

4 Generalized Birthday Algorithm (GBA) 

4.1 General Principle 

The generalized birthday technique is particularly efficient for solving Syndrome 
Decoding-like problems with a large number of solutions. Suppose one has to 
solve the following problem: 

Problem 2. Given a function $f : \mathbb{N} \to \{0,1\}^r$ and an integer $a$, find a set of
$2^a$ indexes $x_i$ such that:

$$\bigoplus_{i=0}^{2^a - 1} f(x_i) = 0.$$

In this problem, $f$ will typically return the $x_i$-th column of a binary matrix $H$.
Note that, here, $f$ is defined upon an infinite set, meaning that there are an
infinity of solutions. To solve this problem, the Generalized Birthday Algorithm
(GBA) does the following:

- build $2^a$ lists $\mathcal{L}_0, \dots, \mathcal{L}_{2^a-1}$, each containing $2^{r/(a+1)}$ different vectors $f(x_i)$,

- pairwise merge lists $\mathcal{L}_{2j}$ and $\mathcal{L}_{2j+1}$ to obtain $2^{a-1}$ lists $\mathcal{L}'_j$ of XORs of 2
vectors $f(x_i)$. Only keep XORs of 2 vectors starting with $\frac{r}{a+1}$ zeros. On
average, the lists $\mathcal{L}'_j$ will contain $2^{r/(a+1)}$ elements.

- pairwise merge the new lists $\mathcal{L}'_{2j}$ and $\mathcal{L}'_{2j+1}$ to obtain $2^{a-2}$ lists $\mathcal{L}''_j$ of XORs
of 4 vectors $f(x_i)$. Only keep XORs of 4 vectors starting with $\frac{2r}{a+1}$ zeros.
On average, the lists $\mathcal{L}''_j$ will still contain $2^{r/(a+1)}$ elements.

- continue these merges until only 2 lists remain. These 2 lists will be composed
of $2^{r/(a+1)}$ XORs of $2^{a-1}$ vectors $f(x_i)$ starting with $(a - 1)\frac{r}{a+1}$ zeros.

- as only $\frac{2r}{a+1}$ bits of the previous vectors are non-zero, a simple application of
the standard birthday technique is enough to obtain 1 solution (on average).

As all the lists manipulated in this algorithm are of the same size, the
complexity of the algorithm is easy to compute: $2^a - 1$ merge operations have to
be performed, each of them requiring to sort a list of size $2^{r/(a+1)}$. The complexity
is thus $O\left(2^a \frac{r}{a+1} 2^{r/(a+1)}\right)$. For simplicity we will only consider a lower bound of the
effective complexity of the algorithm: if we denote by $L$ the size of the largest
list in the algorithm, the complexity is lower-bounded by $O(L \log L)$; this gives
a complexity of $O\left(\frac{r}{a+1} 2^{r/(a+1)}\right)$.
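The merge steps above can be illustrated for $a = 2$ (four lists). The following is a toy sketch under assumptions of its own: vectors are Python integers, "starting with zeros" is implemented on the low-order bits, $r$ is a multiple of 3, and the four lists are built from consecutive index ranges. The names `merge` and `gba_four_lists` are hypothetical.

```python
from collections import defaultdict


def merge(left, right, mask):
    """Keep the XORs x ^ y that are zero on the bits selected by mask.

    Items are (value, indices) pairs; indices is a tuple of inputs to f.
    """
    buckets = defaultdict(list)
    for val, idx in left:
        buckets[val & mask].append((val, idx))
    out = []
    for val, idx in right:
        for lval, lidx in buckets.get(val & mask, []):
            out.append((lval ^ val, lidx + idx))
    return out


def gba_four_lists(f, r, size):
    """Toy GBA with a = 2; returns 4-tuples of indexes whose images XOR to 0."""
    step = r // 3                      # r / (a + 1) bits cancelled per level
    low = (1 << step) - 1
    lists = [[(f(j * size + i), (j * size + i,)) for i in range(size)]
             for j in range(4)]
    left = merge(lists[0], lists[1], low)     # first level of merges
    right = merge(lists[2], lists[3], low)
    full = (1 << r) - 1                       # final birthday step: all bits
    return [idx for _val, idx in merge(left, right, full)]
```

With lists of size $2^{r/3}$ one solution is expected on average; taking larger lists, as in a quick experiment, simply yields more solutions.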

Minimal Memory Requirements. The minimal memory requirements for 
this algorithm are not as easy to compute. If all the lists are chosen to be of the 
same size (as in the description of the algorithm we give), then it is possible to 
compute the solution by storing at most $a$ lists at a time in memory. This gives
us a memory complexity of $O\left(a\, 2^{r/(a+1)}\right)$. However, the starting lists can also be
chosen of different sizes so as to store only smaller lists.

In practice, for each merge operation, only one of the two lists has to be stored
in memory; the second one can always be computed on the fly. As a consequence,
looking at the tree of all merge operations (see Fig. 1), half the lists of the tree
can be computed on the fly (the lists in dashed line circles). Let $L = 2^{r/(a+1)}$ and
suppose one wants to use the Generalized Birthday Algorithm storing only lists
of size $L/\lambda$ for a given $\lambda$. Then, in order to get, on average, a single solution in the
end, the lists computed on the fly should be larger. For instance, in the example
of Fig. 1 one should have:

- $|\mathcal{L}''_1| = \lambda L$, $|\mathcal{L}'_3| = \lambda^2 L$, and $|\mathcal{L}_7| = \lambda^3 L$,

- $|\mathcal{L}'_1| = L$ and $|\mathcal{L}_3| = \lambda L$,

- $|\mathcal{L}_1| = L$ and $|\mathcal{L}_5| = L$.

In the general case this gives us a time/memory tradeoff when using GBA:
one can divide the memory complexity by $\lambda$ at the cost of an increase in time
complexity by a factor $\lambda^a$. However, many other combinations are also possible
depending on the particular problem one has to deal with.

Fig. 1. Merge operations in the Generalized Birthday Algorithm. All lists in dashed
line circles can be computed on the fly.
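The list sizes of this tradeoff follow from one rule: merging lists of sizes $A$ and $B$ on $\log L$ bits leaves $AB/L$ elements on average. The sketch below applies that rule down the merge tree, assuming (as in the example above) that the left child of every merge is stored with size $L/\lambda$ and the right child is computed on the fly. The function name and the `(level, index)` keys are illustrative conventions, with level 0 holding the starting lists.

```python
from fractions import Fraction


def gba_list_sizes(a, L, lam):
    """Sizes of all lists in the merge tree when stored lists have size L/lam.

    Returns {(level, index): size}; level 0 holds the 2^a starting lists.
    """
    L, lam = Fraction(L), Fraction(lam)
    stored = L / lam
    sizes = {}

    def fill(level, index, target):
        sizes[(level, index)] = target
        if level == 0:
            return
        # left child is stored; right child is sized so that
        # stored * right / L equals the target size of this node
        fill(level - 1, 2 * index, stored)
        fill(level - 1, 2 * index + 1, target * L / stored)

    # final birthday step on the two top lists: product of sizes is L^2
    fill(a - 1, 0, stored)
    fill(a - 1, 1, lam * L)
    return sizes
```

For $a = 3$ the largest list is the last starting list, of size $\lambda^3 L$, which matches the factor $\lambda^a$ in time complexity.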


4.2 GBA under Constraints 


In the previous section, we presented a version of GBA where the number of 
vectors available was unbounded and where the number of vectors to XOR was 
a power of 2. In practice, when using GBA to solve instances of the CSD problem 
only n different r-bit vectors are available and w can be any number. We thus 
consider an idealized version of GBA so as to bound the complexity of “real 
world” GBA. The bounds we give are not always very tight. See for instance [7]
for the analysis of a running implementation of GBA under realistic constraints.

If $w$ is not a power of 2, some of the starting lists $\mathcal{L}_j$ should contain vectors
$f(x_i)$ and others XORs of 2 or more vectors $f(x_i)$. We consider that the starting
lists all contain XORs of $\frac{w}{2^a}$ vectors $f(x_i)$, even if this is not an integer. This
will give the most time efficient algorithm, but will of course not be usable in
practice.

The length of the matrix $n$ limits the size of the starting lists. For GBA to
find one solution on average, one needs lists $\mathcal{L}_j$ of size $2^{r/(a+1)}$. As the starting lists
contain XORs of $\frac{w}{2^a}$ vectors, we need $\binom{n}{w/2^a} \ge 2^{r/(a+1)}$. However, this constraint on
$a$ is not sufficient: if all the starting lists contain the same vectors, all XORs will
be found many times and the probability of success will drop. To avoid this, we
need lists containing different vectors and this can be done by isolating the first
level of merges.


- first we select $2^{a-1}$ distinct vectors $s_j$ of $a$ bits such that $\bigoplus_j s_j = 0$,

- then we pairwise merge lists $\mathcal{L}_{2j}$ and $\mathcal{L}_{2j+1}$ to obtain lists $\mathcal{L}'_j$ containing
elements having their $a$ first bits equal to $s_j$.


After this first round, we have $2^{a-1}$ lists of XORs of $\frac{2w}{2^a}$ vectors such that, if we
XOR the $a$ first bits of one element from each list we obtain 0. Also, all the lists
contain only distinct elements, which means we are back in the general case of
GBA, except we now have $2^{a-1}$ lists of vectors of length $r - a$. These lists all
have a maximum size $L = \frac{1}{2^a}\binom{n}{2w/2^a}$ and can be obtained from starting lists $\mathcal{L}_j$ of
size $\sqrt{\binom{n}{2w/2^a}}$ (see Sect. 2). We get the following constraint on $a$:

$$\frac{1}{2^a}\binom{n}{2w/2^a} \ge 2^{(r-a)/a}. \qquad (5)$$

Fig. 2. Logarithm of the complexity of the Generalized Birthday Algorithm for given
$n$ and $r$ when $w$ varies, (a) with no optimization, (b) when the lists are initialized with
shortened vectors, and (c) when $a$ is not an integer.


In practice, after the first level of merges we are not exactly in the general
case of GBA: if, for example, $s_0 \oplus s_1 = s_2 \oplus s_3$, after the second merges, lists
$\mathcal{L}''_0$ and $\mathcal{L}''_1$ would contain exactly the same elements. This can be avoided by using
another set of target values $s'_j$ such that $\bigoplus_j s'_j = 0$ for the second level of merges
(as for the first level) and so on for the subsequent levels of merges (except the
last two levels).

Using Non-Integer Values for a. Equation (5) determines the largest
possible value of $a$ that can be used with GBA. For given $n$ and $r$, if $w$ varies,
the complexity of the algorithm will thus have a stair-like shape (see Fig. 2(a)).
The left-most point of each step corresponds to the case where Equation (5) is
an equality. However, when it is not an equality, it is possible to gain a little:
instead of choosing values $s_j$ of $a$ bits one can use slightly larger values and
thus start the second level of merge with shorter vectors. This gives a
broken-line complexity curve (see Fig. 2(b)). This is somewhat similar to what Minder
and Sinclair denote by “extended k-tree algorithm” [19]. In practice, this is
almost equivalent to using non-integer values for $a$ (see Fig. 2(c)). We will thus
assume that in GBA, $a$ is a real number, chosen such that Equation (5) is an
equality.

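Proposition 3. We can lower bound the binary work factor $WF_{GBA}(n, r, w)$ of
GBA applied to solving an instance of CSD with parameters $(n, r, w)$ by:

$$WF_{GBA}(n, r, w) \ge \frac{r-a}{a}\, 2^{(r-a)/a} \quad \text{with $a$ such that} \quad \frac{1}{2^a}\binom{n}{2w/2^a} = 2^{(r-a)/a}.$$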

Note that this gives us a bound on the minimal time complexity of GBA but 
does not give any bound on the memory complexity of the algorithm. Also, this 
bound is computed using an idealized version of the algorithm: one should not 
expect to achieve such a complexity in practice, except in some cases where a is 
an integer and w a power of 2. 
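Evaluating this bound requires solving for a real-valued $a$. The sketch below is one possible way to do it, under assumptions of its own: the binomial with a non-integer lower argument is computed with the gamma function, the largest admissible $a$ is located by a coarse downward scan from $a = \log(2w)$ followed by bisection, and the constraint is read as $\frac{1}{2^a}\binom{n}{2w/2^a} = 2^{(r-a)/a}$. All names are hypothetical.

```python
from math import lgamma, log, log2

LN2 = log(2.0)


def log2_binom(n, k):
    """log2 of the generalized binomial coefficient; k may be non-integer."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / LN2


def slack(n, r, w, a):
    """log2 of the left side of the constraint minus log2 of the right side."""
    return log2_binom(n, 2 * w / 2 ** a) - a - (r - a) / a


def largest_a(n, r, w, step=0.01):
    """Largest real a for which the constraint holds (with equality)."""
    a = log2(2 * w)                    # here each merged word is one column
    if slack(n, r, w, a) >= 0:
        return a
    while slack(n, r, w, a) < 0:       # coarse scan downwards
        a -= step
        if a <= 0:
            raise ValueError("constraint cannot be met")
    lo, hi = a, a + step               # slack(lo) >= 0 > slack(hi)
    for _ in range(60):                # bisection
        mid = (lo + hi) / 2
        if slack(n, r, w, mid) >= 0:
            lo = mid
        else:
            hi = mid
    return lo


def log2_wf_gba(n, r, w):
    """log2 of ((r - a) / a) * 2^((r - a) / a) at the largest admissible a."""
    a = largest_a(n, r, w)
    t = (r - a) / a
    return log2(t) + t
```

For inversion one would call `log2_wf_gba(n, r, w)`, and for collision search `log2_wf_gba(n, r, 2 * w)`, following the CSD instances discussed in Sect. 5.3.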




5 Case Studies 

Now that we have given some bounds on the complexities of the best algorithms 
to solve CSD problems, we propose to study what happens when using them to 
attack existing constructions. 

Note that in this section, as in the whole paper, we only consider the resis- 
tance to decoding attacks. Code-based cryptosystems may also be vulnerable 
to structural attacks. However, no efficient structural attack is known for bi- 
nary Goppa codes (McEliece encryption and CFS signature) or for prime order 
random quasi-cyclic codes (FSB hash function). 


5.1 Attacking the McEliece Cryptosystem 

In the McEliece [17] and Niederreiter [21] cryptosystems the security relies on two
different problems: recovering the private key from the public key and decrypting
an encrypted message. Decrypting consists in finding an error pattern $e$ of weight
$w$, such that $e \times H^T = c$ where $H$ is a binary matrix derived from the public key
and $c$ is a syndrome derived from the encrypted message one wants to decrypt.
Here, we suppose that the structural attack consisting in recovering the private 
key is infeasible and can assume that H is a random binary matrix. Decryption 
thus consists in solving an instance of the CSD problem where one knows that 
one and only one solution exists. 

Having a single solution rules out any attempt to use GBA, or at least, any
attempt to use GBA would consist in using the classical birthday attack. For this
reason the best attacks against the McEliece and Niederreiter cryptosystems
are all based on ISD. Table 3 gives the work factors we obtain using our bound
from Sect. 3. For the classical McEliece parameters (10, 50) this bound can be
compared to the work factors computed by non-idealized algorithms. Canteaut
and Chabaud [10] obtained a work factor of $2^{64.2}$ and Bernstein, Lange and
Peters [6] a work factor of $2^{60.5}$. As one can see, the gap between our bound and
their complexities is very small, indicating two things:

— our bound on ISD is tight when evaluating the practical security of some 
McEliece parameters, 

- the best ISD-based algorithms are sufficiently advanced to make our assump- 
tion that Gaussian elimination is free almost realistic. Almost no margin is 
left for these techniques to improve and better attacks will need to introduce 
new methods. 


5.2 Attacking the CFS Signature Scheme 

The attack we present here is due to Daniel Bleichenbacher, but was never
published. We present what he explained through private communication, including
a few additional details.

The CFS signature scheme [13] is based on the Niederreiter cryptosystem:
signing a document requires hashing it into a syndrome and then trying to decode
this syndrome. However, for a Goppa code correcting $w$ errors, only a fraction
$\frac{1}{w!}$ of the syndromes are decodable. Thus, a counter is appended to the message
and the signer tries successive counter values until one hash is decodable. The
signature consists of both the error pattern of weight $w$ corresponding to the
syndrome and the value of the counter giving this syndrome.

Table 3. Work factors for the ISD lower bound we computed for some typical
McEliece/Niederreiter parameters. The code has length $n = 2^m$ and codimension
$r = mw$ and corrects $w$ errors.

(m, w)      optimal p    optimal ℓ    binary work factor
(10, 50)    4            22           2^59.9
(11, 32)    6            33           2^86.8
(12, 41)    10           54           2^128.5

Attacking this construction consists in forging a valid signature for a chosen 
message. One must find a matching counter and error pattern for a given doc- 
ument. This looks a lot like a standard CSD problem instance. However, here 
there is one major difference with the case of McEliece or Niederreiter: instead of 
having one instance to solve, one now needs to solve one instance among many 
instances. One chooses a document and hashes it with many different counters 
to obtain many syndromes: each syndrome corresponds to a different instance. 
It has no importance which instance is solved, each of them can give a valid 
“forged” signature. 

For ISD algorithms, having multiple instances available is of little help;
however, for GBA, this gives us one additional list. Even though Goppa code
parameters are used and an instance has less than one solution on average, this additional
list makes the application of GBA with $a = 2$ possible. This will always be an
“unbalanced” GBA working as follows:

- first, build 3 lists $\mathcal{L}_0$, $\mathcal{L}_1$, and $\mathcal{L}_2$ of XORs of respectively $w_0$, $w_1$ and $w_2$
columns of $H$ (with $w = w_0 + w_1 + w_2$). These lists can have a size up to
$\binom{n}{w_i}$ but smaller sizes can be used,

- merge the two lists $\mathcal{L}_0$ and $\mathcal{L}_1$ into a list $\mathcal{L}'_0$ of XORs of $w_0 + w_1$ columns of
$H$, keeping only those starting with $\lambda$ zeros (we will determine the optimal
choice for $\lambda$ later). $\mathcal{L}'_0$ contains $\frac{1}{2^\lambda}\binom{n}{w_0+w_1}$ elements on average.

- All the following computations are done on the fly and additional lists do
not have to be stored. Repeat the following steps:

    • choose a counter and compute the corresponding document hash (an
    element of the virtual list $\mathcal{L}_3$),

    • XOR this hash with all elements of $\mathcal{L}_2$ matching on the first $\lambda$ bits (to
    obtain elements of the virtual list $\mathcal{L}'_1$),

    • look up each of these XORs in $\mathcal{L}'_0$: any complete match gives a valid
    signature.

The number $L$ of hashes one will have to compute on average is such that:

$$\frac{1}{2^\lambda}\binom{n}{w_0+w_1} \times \frac{L}{2^\lambda}\binom{n}{w_2} = 2^{r-\lambda} \iff L = \frac{2^{r+\lambda}}{\binom{n}{w_0+w_1}\binom{n}{w_2}}.$$

The memory requirements for this algorithm correspond to the size of the largest
list stored. In practice, the first level lists $\mathcal{L}_i$ can be chosen so that $\mathcal{L}'_0$ is always
the largest, and the memory complexity is $\frac{1}{2^\lambda}\binom{n}{w_0+w_1}$. The time complexity
corresponds to the size of the largest list manipulated: $\max\left(\frac{1}{2^\lambda}\binom{n}{w_0+w_1},\, L,\, \frac{L}{2^\lambda}\binom{n}{w_2}\right)$.
The optimal choice is always to choose $w_0 = \lceil w/3 \rceil$, $w_2 = \lfloor w/3 \rfloor$ and $w_1 =
w - w_0 - w_2$. Then, two different cases can occur: either $\mathcal{L}_1$ is the largest list,
or one of $\mathcal{L}'_0$ and $\mathcal{L}_3$ is. If $\mathcal{L}_1$ is the largest, we choose $\lambda$ so as to have a smaller
list $\mathcal{L}'_0$ and so a smaller memory complexity. Otherwise, we choose $\lambda$ so that
$\mathcal{L}'_0$ and $\mathcal{L}_3$ are of the same size to optimize the time complexity. Let $T$ be the
size of the largest list we manipulate and $M$ the size of the largest list we
store. The algorithm has time complexity $O(T \log T)$ and memory complexity
$O(M \log M)$ with:


1 



This algorithm is realistic in the sense that only integer values are used,
meaning that effective attacks should have time/memory complexities close to
those we present in Table 4. Of course, for a real attack, other time/memory
tradeoffs might be more advantageous, resulting in other choices for $\lambda$ and the $w_i$.
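The balanced case of this attack (where $\lambda$ is chosen so that the stored list $\mathcal{L}'_0$ and the number of hashes have the same size) can be worked out numerically. The sketch below is not the authors' formula: it assumes that $\mathcal{L}'_0$ holds $\binom{n}{w_0+w_1}/2^\lambda$ elements, that the expected number of hashes is $2^{r+\lambda} / \left(\binom{n}{w_0+w_1}\binom{n}{w_2}\right)$, and that $w_0 = \lceil w/3 \rceil$, $w_2 = \lfloor w/3 \rfloor$. It does not cover the case where $\mathcal{L}_1$ is the largest list. The function name is hypothetical.

```python
from math import comb, log2


def cfs_balanced_attack(m, w):
    """Balanced case only: returns (lambda, log2 T) for Goppa parameters.

    Assumes n = 2^m, r = m * w, and lambda chosen so that the size of L'_0
    equals the expected number of hashes (an assumption of this sketch).
    """
    n, r = 2 ** m, m * w
    w2 = w // 3
    w0 = -(-w // 3)                    # ceil(w / 3)
    w1 = w - w0 - w2
    b01 = log2(comb(n, w0 + w1))
    b2 = log2(comb(n, w2))
    # b01 - lam = r + lam - b01 - b2  =>  lam = (2 * b01 + b2 - r) / 2
    lam = (2 * b01 + b2 - r) / 2
    log2_t = b01 - lam                 # equals (r - b2) / 2
    return lam, log2_t
```

For $m = 15$ and $w = 8$ this gives $\log T \approx 45.5$, and $\log(T \log T) \approx 51.0$, which agrees with the corresponding entry of Table 4 if that table reports $T \log T$.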

5.3 Attacking the FSB Hash Function 

FSB [1] is a candidate for the SHA-3 hash competition. The compression
function of this hash function consists in converting the input into a low weight word
and then multiplying it by a binary matrix $H$. This is exactly a syndrome
computation and inverting this compression function requires solving an instance

Table 4. Time/memory complexities of Bleichenbacher’s attack against the CFS
signature scheme. The parameters are Goppa code parameters so $r = mw$ and $n = 2^m$.

          w = 8            w = 9            w = 10           w = 11           w = 12
m = 15    2^51.0/2^51.0    2^60.2/2^43.3    2^63.1/2^55.9    2^67.2/2^67.2    2^81.6/2^54.9
m = 16    2^54.1/2^54.1    2^63.3/2^46.5    2^66.2/2^60.0    2^71.3/2^71.3    2^85.6/2^59.0
m = 17    2^57.2/2^57.2    2^66.4/2^49.6    2^69.3/2^64.2    2^75.4/2^75.4    2^89.7/2^63.1
m = 18    2^60.3/2^60.3    2^69.5/2^52.7    2^72.4/2^68.2    2^79.5/2^79.5    2^93.7/2^67.2
m = 19    2^63.3/2^63.3    2^72.8/2^55.7    2^75.4/2^72.3    2^83.6/2^83.6    2^97.8/2^71.3
m = 20    2^66.4/2^66.4    2^75.6/2^58.8    2^78.5/2^76.4    2^87.6/2^87.6    2^101.9/2^75.4
m = 21    2^69.5/2^69.5    2^78.7/2^61.9    2^81.5/2^80.5    2^91.7/2^91.7    2^105.9/2^79.5
m = 22    2^72.6/2^72.6    2^81.7/2^65.0    2^84.6/2^84.6    2^95.8/2^95.8    2^110.0/2^83.6

Table 5. Complexities of the ISD and GBA bounds we propose for the official FSB
parameters

                                    inversion              collision
          n           r      w      ISD        GBA         ISD        GBA
FSB160    5 × 2^18    640    80     2^211.1    2^156.6     2^100.3    2^118.7
FSB224    7 × 2^18    896    112    2^292.0    2^216.0     2^135.3    2^163.4
FSB256    2^21        1024   128    2^330.8    2^245.6     2^153.0    2^185.7
FSB384    23 × 2^16   1472   184    2^476.7    2^360.2     2^215.5    2^268.8
FSB512    31 × 2^16   1984   248    2^687.8    2^482.1     2^285.6    2^359.3


of the CSD problem. Similarly, finding a collision on the compression function
requires finding two low weight words having the same syndrome, that is, a word
of twice the Hamming weight with a null syndrome. In both cases, the security of
the compression function (and thus of the whole hash function) can be reduced
to the hardness of solving some instances of the CSD problem. For inversion (or
second preimage), the instances are of the form CSD($H$, $w$, $s$) and, for collision,
of the form CSD($H$, $2w$, $0$).
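The link between collisions and null-syndrome words is just linearity of the syndrome map, and can be checked on a toy matrix. The sketch below is an illustration with hypothetical names; words are given by their supports (tuples of column indexes) and columns of $H$ are Python integers. Note that when the two colliding words share a coordinate, their sum has weight smaller than $2w$.

```python
from itertools import combinations


def syndrome(support, cols):
    """XOR of the columns of H indexed by support."""
    s = 0
    for j in support:
        s ^= cols[j]
    return s


def find_collision(cols, w):
    """Brute-force search for two distinct weight-w words with equal syndrome."""
    seen = {}
    for support in combinations(range(len(cols)), w):
        s = syndrome(support, cols)
        if s in seen:
            return seen[s], support
        seen[s] = support
    return None


def collision_to_word(x, y):
    """Support of x + y over GF(2): symmetric difference of the supports."""
    return sorted(set(x) ^ set(y))
```

With $n = 16$, $r = 6$ and $w = 2$ there are 120 words but only 64 syndromes, so a collision always exists, and the sum of the two colliding words has a null syndrome.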

Compared to the other code-based cryptosystems we presented, here, the 
number of solutions to these instances is always very large: we are studying a 
compression function, so there are a lot of collisions, and each syndrome has a lot 
of inverses. For this reason, both ISD and GBA based attacks can be used. Which 
of the two approaches is the most efficient depends on the parameters. However, 
for the parameters proposed in [1], ISD is always the best choice for collision
search and GBA the best choice for inversion (or second preimage). Table 5
contains the attack complexities given by our bounds for the proposed FSB
parameters. As one can see, the complexities obtained with GBA for inversion
are lower than the standard security claim. Unfortunately this does not give an 
attack on FSB for many reasons: the version of GBA we consider is idealized 
and using non-integer values of a is not practical, but most importantly, the 
input of the compression of FSB is not any word of weight w, but only regular 
words, meaning that the starting lists for GBA will be much smaller in practice, 
yielding a smaller a and higher complexities. 


Conclusion 

In this article we have reviewed the two main families of algorithms for solving 
instances of the CSD problem. For each of these we have discussed possible 
tweaks and described idealized versions of the algorithms covering those tweaks. 
The work factors we computed for these idealized versions are lower bounds 
on the effective work factor of existing real algorithms, but also on the future 
improvements that could be implemented. Solving CSD more efficiently than 
these bounds would require to introduce new techniques, never applied to code- 
based cryptosystems. 




For these reasons, the bounds we give can be seen as a tool one can use to
select parameters for code-based cryptosystems. We hope they can help other
designers choose durable parameters with more ease.

References 

1. Augot, D., Finiasz, M., Gaborit, Ph., Manuel, S., Sendrier, N.: SHA-3 proposal: 
FSB. Submission to the SHA-3 NIST competition (2008) 

2. Augot, D., Finiasz, M., Sendrier, N.: A family of fast syndrome based cryptographic 
hash function. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715, 
pp. 64-83. Springer, Heidelberg (2005) 

3. Berger, T., Cayrel, P.-L., Gaborit, P., Otmani, A.: Reducing key length of
the McEliece cryptosystem. In: Preneel, B. (ed.) AFRICACRYPT 2009. LNCS,
vol. 5580, pp. 60-76. Springer, Heidelberg (to appear, 2009)

4. Berlekamp, E.R., McEliece, R.J., van Tilborg, H.C.: On the inherent intractability 
of certain coding problems. IEEE Transactions on Information Theory 24(3) (1978) 

5. Bernstein, D., Buchmann, J., Ding, J. (eds.): Post-Quantum Cryptography. 
Springer, Heidelberg (2008) 

6. Bernstein, D., Lange, T., Peters, C.: Attacking and defending the McEliece cryp- 
tosystem. In: Buchmann, J., Ding, J. (eds.) PQCrypto 2008. LNCS, vol. 5299, pp. 
31-46. Springer, Heidelberg (2008) 

7. Bernstein, D.J., Lange, T., Peters, C., Niederhagen, R., Schwabe, P.: Implementing
Wagner’s generalized birthday attack against the SHA-3 candidate FSB. Cryptology
ePrint Archive, Report 2009/292 (2009), http://eprint.iacr.org/

8. Bernstein, D.J., Lange, T., Peters, C., van Tilborg, H.: Explicit bounds for generic 
decoding algorithms for code-based cryptography. In: Pre-proceedings of WCC 
2009, pp. 168-180 (2009) 

9. Camion, P., Patarin, J.: The knapsack hash function proposed at Crypto 1989 can
be broken. In: Davies, D.W. (ed.) EUROCRYPT 1991. LNCS, vol. 547, pp. 39-53.
Springer, Heidelberg (1991)

10. Canteaut, A., Chabaud, F.: A new algorithm for finding minimum-weight words in 
a linear code: Application to McEliece’s cryptosystem and to narrow-sense BCH 
codes of length 511. IEEE Transactions on Information Theory 44(1), 367-378 
(1998) 

11. Chose, P., Joux, A., Mitton, M.: Fast correlation attacks: An algorithmic point of
view. In: Knudsen, L.R. (ed.) EUROCRYPT 2002. LNCS, vol. 2332, pp. 209-221.
Springer, Heidelberg (2002)

12. Coron, J.-S., Joux, A.: Cryptanalysis of a provably secure cryptographic hash func- 
tion. Cryptology ePrint Archive (2004), http://eprint.iacr.org/2004/013/ 

13. Courtois, N., Finiasz, M., Sendrier, N.: How to achieve a McEliece-based digital 
signature scheme. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 
157-174. Springer, Heidelberg (2001) 

14. Finiasz, M., Sendrier, N.: Security bounds for the design of code-based
cryptosystems. Cryptology ePrint Archive, Report 2009/414 (2009),
http://eprint.iacr.org/

15. Lee, P.J., Brickell, E.F.: An observation on the security of McEliece’s public-key 
cryptosystem. In: Gunther, C.G. (ed.) EUROCRYPT 1988. LNCS, vol. 330, pp. 
275-280. Springer, Heidelberg (1988) 




16. Leon, J.S.: A probabilistic algorithm for computing minimum weights of large 
error-correcting codes. IEEE Transactions on Information Theory 34(5), 1354-1359 
(1988) 

17. McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory. In: 
DSN Prog. Rep., Jet Prop. Lab., California Inst. Technol., Pasadena, CA, pp. 
114-116 (January 1978) 

18. Aguilar Melchor, C., Cayrel, P.-L., Gaborit, P.: A new efficient threshold ring sig- 
nature scheme based on coding theory. In: Buchmann, J., Ding, J. (eds.) PQCrypto 
2008. LNCS, vol. 5299, pp. 1-16. Springer, Heidelberg (2008) 

19. Minder, L., Sinclair, A.: The extended k-tree algorithm. In: Mathieu, C. (ed.)
Proceedings of SODA 2009, pp. 586-595. SIAM, Philadelphia (2009)

20. Misoczki, R., Barreto, P.S.L.M.: Compact McEliece keys from Goppa codes. Cryp- 
tology ePrint Archive, Report 2009/187 (2009), http://eprint.iacr.org/ 

21. Niederreiter, H.: Knapsack-type cryptosystems and algebraic coding theory. Prob.
Contr. Inform. Theory 15(2), 157-166 (1986)

22. Stern, J.: A method for finding codewords of small weight. In: Wolfmann, J., Cohen,
G. (eds.) Coding Theory 1988. LNCS, vol. 388, pp. 106-113. Springer, Heidelberg
(1989)

23. Stern, J.: A new identification scheme based on syndrome decoding. In: Stinson,
D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 13-21. Springer, Heidelberg (1994)

24. Veron, P.: A fast identification scheme. In: IEEE Conference, ISIT 1995, Whistler, 
BC, Canada, September 1995, p. 359 (1995) 

25. Wagner, D.: A generalized birthday problem. In: Yung, M. (ed.) CRYPTO 2002. 
LNCS, vol. 2442, pp. 288-303. Springer, Heidelberg (2002) 

A Comments on the Assumptions 

We have assumed the following in Sect. 0

(B1) For all pairs $(e_1, e_2)$ examined in the algorithm, the sums $e_1 + e_2$ are
uniformly and independently distributed in $W_{n,w}$.

(B2) The cost of the execution of the algorithm is approximately equal to

$$\ell \cdot \#(\mathrm{BA}\ 1) + \ell \cdot \#(\mathrm{BA}\ 2) + K_0 \cdot \#(\mathrm{BA}\ 3),$$

where $K_0$ is the cost of testing $e_1 H^T = s + e_2 H^T$ given that $h_\ell(e_1 H^T) =
h_\ell(s + e_2 H^T)$, and $\#(\mathrm{BA}\ i)$ is the expected number of executions of the
instruction (BA i) before we meet the (SUCCESS) condition.

The first assumption has to do with the way the attacker chooses the sets $W_1$
and $W_2$. In the version presented at the beginning of Sect. 0 they use different
sets of columns and thus all pairs $(e_1, e_2)$ lead to different words $e = e_1 + e_2$.
When $W_1$ and $W_2$ increase, there is some waste, that is, some words $e = e_1 + e_2$
are obtained several times. A clever choice of $W_1$ and $W_2$ may decrease this
waste, but this seems exceedingly difficult. The “overlapping” approach

[Figure: the matrix $H$, with a column block labeled $H_2$]

is easy to implement and behaves (almost) as if $W_1$ and $W_2$ were random (it
is even sometimes slightly better). The second assumption counts only $\ell$ binary
operations to perform the sum of $w/2$ columns of $\ell$ bits. This can be achieved by
a proper scheduling of the loops and by keeping partial sums. This was described
and implemented in jOj . We also neglect the cost of control and memory handling
instructions. This is certainly optimistic, but on modern processors most of those
costs can be hidden in practice. The present work is meant to give security levels
rather than cryptanalysis costs, so we want our estimates to be as implementation
independent as possible.

Similar comments apply to the assumptions m and (13) of Sect. 0 

B A Sketch of the Proof of Proposition 2

We provide here some clues for the proof of Proposition 2. More details on this
proof and on the proofs of the other results of this paper can be found in the
extended version [14].

Proof (of Proposition 2, sketch). In one execution of (main loop) we examine
$\lambda(z)\binom{k+\ell}{p}$ distinct values of $e_1 + e_2$, where
$z = |W_1||W_2| / \binom{k+\ell}{p}$ and $\lambda(z) = 1 - \exp(-z)$.
The probability for one particular element of $W_{k+\ell,p}$ to lead to a
solution is



P = 


min 


Thus the probability for one execution of (main loop) to lead to (success) is 



When $N_p(\ell)$ is large (much larger than 1), we have $P_p(\ell) \approx \lambda(z)/N_p(\ell)$ and a
good estimate for the cost is



Choosing $|W_1|$, $|W_2|$, $\ell$ and $z$ which minimize this formula leads to the first
formula of the statement.

Otherwise we have $N_p(\ell) < 1$ and the expected number of executions of (main
loop) is not much higher than one (obviously it cannot be less). In that case
we are in a situation very similar to a birthday attack in which the list size is
L = \/l/P = T'/ 2 / J ■ This gives a cost of 2L\og(K vl _ p L) which has to 
be minimized in £, leading to the second formula of the statement. 
Abstract. In this work, we apply the rebound attack to the AES based
SHA-3 candidate Lane. The hash function Lane uses a permutation
based compression function, consisting of a linear message expansion
and 6 parallel lanes. In the rebound attack on Lane, we apply several
new techniques to construct a collision for the full compression function
of Lane-256 and Lane-512. Using a relatively sparse truncated differential
path, we are able to solve for a valid message expansion and colliding
lanes independently. Additionally, we are able to apply the inbound
phase more than once by exploiting the degrees of freedom in the parallel
AES states. This allows us to construct semi-free-start collisions for full
Lane-256 with $2^{96}$ compression function evaluations and $2^{88}$ memory,
and for full Lane-512 with $2^{224}$ compression function evaluations and
$2^{128}$ memory.

Keywords: SHA-3, LANE, hash function, cryptanalysis, rebound at- 
tack, semi-free-start collision. 


1 Introduction 

In the last few years the cryptanalysis of hash functions has become an important 
topic within the cryptographic community. The attacks on the MD4 family of 
hash functions (MD5, SHA-1) have especially weakened the confidence in the 
security of this design strategy jl 311 1 j . Many new and interesting hash function 
designs have been proposed as part of the NIST SHA-3 competition |Hj. The 
large number of submissions and different design strategies require different and
improved cryptanalytic techniques as well. 

At FSE 2009, Mendel et al. published the rebound attack |2| - a new technique 
for analysis of hash functions which has been applied first to reduced versions of 
the Whirlpool |2| and Grøstl jjj compression functions. Recently, the rebound
attack on Whirlpool has been extended in jH|, which in some parts is similar to 
our attack. The main idea of the rebound attack is to use the available degrees 
of freedom in the internal state to efficiently fulfill the low probability parts in 
the middle of a differential trail. The straight-forward application of the rebound 
attack to AES based constructions allows a quick and thorough analysis of these 
hash functions. 

In this work, we improve the rebound attack and apply it to the SHA-3 candi- 
date Lane. The hash function Lane jS| uses an iterative construction based on 
the Merkle-Damgard design principle 131 1 01 and has been first analyzed in m- 
The permutation based compression function consists of a linear message ex- 
pansion and 6 parallel lanes. The permutations of each lane are based on the 
round transformations of the AES. In the rebound attack on Lane, we first 
search for differences and values, according to a specific truncated differential 
path. This truncated differential path is constructed such that a collision and 
a valid expanded message can be found with a relatively high probability. By 
using the degrees of freedom in the chaining values, we are able to construct a
semi-free-start collision for the full versions of Lane-256 with $2^{96}$ compression
function evaluations and memory of $2^{88}$, and for Lane-512 with $2^{224}$ compression
function evaluations and memory of $2^{128}$. Although these collisions on the
compression function do not imply an attack on the hash functions, they violate
the reduction proofs of Merkle and Damgård, and Andreeva [D .

2 Description of Lane 

The cryptographic hash function Lane jS] is one of the submissions to the NIST 
SHA-3 competition d, It is an iterated hash function that supports four digest 
sizes (224, 256, 384 and 512 bits) and the use of a salt. Since Lane-224 and 
Lane-256 are rather similar except for truncation, we write Lane-256 whenever 
we refer to both of them. The same holds for Lane-384 and Lane-512. 

The hashing of a message proceeds as follows. First, the initial chaining value
$H_{-1}$, of size 256 bits for Lane-256, and 512 bits for Lane-512, is set to an initial
value that depends on the digest size $n$ and the optional salt value $S$. At the same
time, the message is padded and split into message blocks $M_i$ of length 512 bits
for Lane-256, and 1024 bits for Lane-512. Then, a compression function $f$ is
applied iteratively to process message blocks one by one as
$H_i = f(H_{i-1}, M_i, C_i)$,
where $C_i$ is a counter that indicates the number of message bits processed so
far. Finally, after all the message blocks are processed, the final digest is derived
from the last chaining value, the message length and the salt by an additional
call to the compression function.
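
The iteration described above can be sketched as follows. This is only an illustration of the chaining rule $H_i = f(H_{i-1}, M_i, C_i)$: the function `toy_compress` is a hypothetical stand-in built from SHA-256 and is not the Lane compression function, padding and the final output call are omitted, and the convention that the counter already includes the current block is an assumption.

```python
import hashlib

BLOCK_BYTES = 64   # 512-bit message blocks, as for Lane-256
STATE_BYTES = 32   # 256-bit chaining value, as for Lane-256


def toy_compress(h_prev: bytes, block: bytes, counter: int) -> bytes:
    """Hypothetical stand-in for f; NOT the Lane compression function."""
    return hashlib.sha256(h_prev + block + counter.to_bytes(8, "big")).digest()


def iterate(h_init: bytes, blocks: list[bytes]) -> bytes:
    """Apply H_i = f(H_{i-1}, M_i, C_i) block by block.

    Assumption: C_i counts the message bits processed up to and
    including block i.
    """
    h = h_init
    bits = 0
    for block in blocks:
        bits += 8 * len(block)
        h = toy_compress(h, block, bits)
    return h


if __name__ == "__main__":
    digest = iterate(bytes(STATE_BYTES), [b"\x01" * BLOCK_BYTES, b"\x02" * BLOCK_BYTES])
    print(digest.hex())
```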
2.1 The Compression Function

The compression function of Lane-256 transforms 256 bits (512 in the case of 
Lane-512) of the chaining value and 512 bits (resp. 1024 bits) of the message 
block into a new chaining value of 256 bits (512 bits). It uses a 64-bit counter
value $C_i$. For the detailed structure of the compression function we refer to
the specification of Lane jSj. First, the chaining value and the message block
are processed by a message expansion that produces an expanded state with
doubled size. Then, this expanded state is processed in two layers. The first
layer is composed of six permutation lanes $P_0, \ldots, P_5$ in parallel, and the second
layer of two parallel lanes $Q_0, Q_1$.

2.2 The Message Expansion 

The message expansion of Lane takes a message block $M_i$ and a chaining value
$H_{i-1}$ and produces the input to six permutations $P_0, \ldots, P_5$. In Lane-256, the
512-bit message block $M_i$ is split into four 128-bit blocks $m_0, m_1, m_2, m_3$ and
the 256-bit chaining value $H_{i-1}$ is split into two 128-bit words $h_0, h_1$ as follows:
$m_0 \| m_1 \| m_2 \| m_3 \leftarrow M_i$, $h_0 \| h_1 \leftarrow H_{i-1}$. Then, six more 128-bit words
$a_0, a_1, b_0, b_1, c_0, c_1$ are computed:

$$
\begin{aligned}
a_0 &= h_0 \oplus m_0 \oplus m_1 \oplus m_2 \oplus m_3, & a_1 &= h_1 \oplus m_0 \oplus m_2, \\
b_0 &= h_0 \oplus h_1 \oplus m_0 \oplus m_2 \oplus m_3, & b_1 &= h_0 \oplus m_1 \oplus m_2, \\
c_0 &= h_0 \oplus h_1 \oplus m_0 \oplus m_1 \oplus m_2, & c_1 &= h_0 \oplus m_0 \oplus m_3.
\end{aligned}
\tag{1}
$$

Each of these 128-bit values, as in AES, can be seen as a 4 × 4 matrix of bytes.
In the following, we will use the notation $x[i, j]$ when we refer to the byte of the
matrix $x$ with row index $i$ and column index $j$, starting from 0.

The values $a_0 \| a_1$, $b_0 \| b_1$, $c_0 \| c_1$, $h_0 \| h_1$, $m_0 \| m_1$, $m_2 \| m_3$ become inputs to the
six permutations $P_0, \ldots, P_5$ described below. The message expansion for larger
variants of Lane is identical but all the values are doubled in size.
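
As a minimal sketch, the Lane-256 message expansion of Equation (1) can be written with 128-bit words held as Python integers. The function and key names are illustrative, and the byte order used by `byte_at` (big-endian serialization, column-major as in AES) is an assumption, not something fixed by the text above.

```python
def expand(h0: int, h1: int, m0: int, m1: int, m2: int, m3: int) -> dict:
    """Message expansion of Equation (1) on 128-bit words (XOR only)."""
    return {
        "a0": h0 ^ m0 ^ m1 ^ m2 ^ m3, "a1": h1 ^ m0 ^ m2,
        "b0": h0 ^ h1 ^ m0 ^ m2 ^ m3, "b1": h0 ^ m1 ^ m2,
        "c0": h0 ^ h1 ^ m0 ^ m1 ^ m2, "c1": h0 ^ m0 ^ m3,
    }


def lane_inputs(h0, h1, m0, m1, m2, m3):
    """Inputs to P_0, ..., P_5 as pairs of 128-bit words."""
    e = expand(h0, h1, m0, m1, m2, m3)
    return [(e["a0"], e["a1"]), (e["b0"], e["b1"]), (e["c0"], e["c1"]),
            (h0, h1), (m0, m1), (m2, m3)]


def byte_at(x: int, i: int, j: int) -> int:
    """x[i, j]: byte in row i, column j (assumed column-major, big-endian)."""
    return x.to_bytes(16, "big")[4 * j + i]


if __name__ == "__main__":
    # A single set bit in m0 reaches every expanded word except b1.
    print(expand(0, 0, 1, 0, 0, 0))
```

Because the expansion is linear over XOR, the same function maps input differences to output differences, which is what the differential analysis relies on.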

2.3 The Permutations 

Each permutation lane $P_i$ operates on a state that can be seen as a double AES
state (2 × 128 bits) in the case of Lane-256 or quadruple AES state (4 × 128 bits)
for Lane-512. The permutation reuses the transformations SubBytes (SB),
ShiftRows (SR) and MixColumns (MC) of the AES with the only exception that,
due to the larger state size, they are applied twice or four times in parallel.

Additionally, there are three new round transformations introduced in Lane.
AddConstant adds a different value to each column of the lane state and AddCounter
adds part of the counter $C_i$ to the state. Since our attacks do not depend on these
functions, we skip their details here. The third transformation is SwapColumns
(SC), used for mixing parallel AES states. Let $x_i$ be a column of a lane state. In
Lane-256, SwapColumns swaps the two right columns of the left half-state with the
two left columns of the right half-state, and in Lane-512, SwapColumns ensures
that each column of an AES state gets swapped to a different AES state:

$$SC_{256}(x_0 \| x_1 \| \cdots \| x_7) = x_0 \| x_1 \| x_4 \| x_5 \| x_2 \| x_3 \| x_6 \| x_7$$

$$SC_{512}(x_0 \| x_1 \| \cdots \| x_{15}) = x_0 \| x_4 \| x_8 \| x_{12} \| x_1 \| x_5 \| x_9 \| x_{13} \| x_2 \| x_6 \| x_{10} \| x_{14} \| x_3 \| x_7 \| x_{11} \| x_{15}$$
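
The two column permutations above can be sketched as index tables. The names `SC256`, `SC512` and `swap_columns` are illustrative; a lane state is modeled simply as a list of columns.

```python
# Output position k receives input column SC[k], following the two formulas above.
SC256 = [0, 1, 4, 5, 2, 3, 6, 7]
SC512 = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]


def swap_columns(cols: list) -> list:
    """Apply SwapColumns to a lane state given as 8 or 16 columns."""
    if len(cols) == 8:
        perm = SC256
    elif len(cols) == 16:
        perm = SC512
    else:
        raise ValueError("expected 8 columns (Lane-256) or 16 columns (Lane-512)")
    return [cols[i] for i in perm]


if __name__ == "__main__":
    print(swap_columns(list(range(8))))
    print(swap_columns(list(range(16))))
```

Both tables are involutions (for Lane-512 the table is a 4 × 4 transpose of the column indices), so applying `swap_columns` twice restores the original order.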

The complete round transformation consists of the sequential application of all
these transformations in the given order. The last round omits AddConstant and
AddCounter. Each of the permutations $P_j$ consists of six rounds in the case of
Lane-256 and eight rounds for Lane-512.

The permutations $Q_0$ and $Q_1$ are irrelevant to our attack because we will get
collisions before these permutations. An interested reader can find a detailed
description of $Q_0$ and $Q_1$ in jSj .

3 The Rebound Attack on Lane 

In this section, we first give a short overview of the rebound attack in general
and then describe the different phases of the rebound attack on Lane in detail.

3.1 The Rebound Attack 

The rebound attack was published by Mendel et al. in 0 and is a new tool 
for the cryptanalysis of hash functions. The rebound attack uses truncated dif- 
ferences © and is related to the attack by Peyrin ca on the hash function 
Grindahl [Z|. The main idea of the rebound attack is to use the available degrees 
of freedom in the internal state to fulfill the low probability parts in the middle 
of a differential path. It consists of an inbound and subsequent outbound phase. 
The inbound phase is an efficient meet-in-the-middle phase, which exploits the 
available degrees of freedom in the middle of a differential path. In the mostly 
probabilistic outbound phase, the matches of the inbound phase are computed 
backwards and forwards to obtain an attack on the hash or compression function. 
Usually, the inbound phase is repeated many times to generate enough starting 
points for the outbound phase. In the following, we describe the inbound and 
outbound phase of the rebound attack on Lane. 

3.2 Outline of the Rebound Attack on Lane 

Due to the message expansion of Lane, at least 4 lanes are active in a differential
attack. We will launch a semi-free-start collision attack, and therefore we assume
the differences in $(h_0, h_1)$ to be zero. Hence, lane $P_3$ is not active and we choose
$P_1$, and thus $(b_0, b_1)$, to be not active as well. The active lanes in our attack
on Lane are $P_0$, $P_2$, $P_4$ and $P_5$. The corresponding truncated differential path
for the P-lanes of Lane-256 is shown in Fig. 2. This path is very similar to
the truncated differential path for Lane-256 shown in the Lane specification
[Fig. 4.2, page 33], but turned upside-down. The truncated differential path used
in the attack on Lane-512 is the same as in the Lane specification [Fig. 4.3,
page 34] and shown in Fig. 01 The main idea of these paths is to use differences
in only one of the parallel AES states for the inbound phases. This allows us
to use the freedom in the other states to satisfy the outbound phases. Since we
search for a collision after the P-lanes, we do not need to consider the Q-lanes.

Fig. 1. The inbound phase for Lane-256 (left) and Lane-512 (right). Black bytes are
active, gray bytes fixed by solutions of the inbound phase.

The main idea of the attack on Lane is that we can apply more than one 
efficient inbound phase by using the degrees of freedom and the relatively slow 
diffusion due to the 2 (or 4) parallel AES states of Lane-256 (or Lane-512). The 
positions of the active bytes of two consecutive inbound phases are chosen such 
that when merging them, the number of the common active bytes of these phases 
is as small as possible. Since we can find many independent solutions for these 
inbound phases, we store them in some lists to be merged. In the outbound 
phase of the attack we merge the results of the inbound phases and further, 
merge the results of all active P-lanes. Note that the merging of two lists can be 
done efficiently. In each merging step, a number of conditions need to be fulfilled 
for the elements of the new list. We merge the lists in a clever order, such that 
we find one colliding pair for the compression function at the end. 

In more detail, we first filter the results of each inbound phase for those 
solutions, which can connect both inbound phases (see Fig. 0). Then, we merge 
the resulting lists of two lanes such that we get a collision after the P-lanes, 
and parts of the message expansion are fulfilled. Finally, we filter the results of 
the left P-lanes ($P_0$, $P_2$) and the right P-lanes ($P_4$, $P_5$), such that the conditions
on the whole message expansion are fulfilled. Throughout the attack, we try to keep
the intermediate results at a reasonable size. We need to ensure that the
complexity of generating the lists is below $2^{n/2}$, but still get enough solutions in
each phase to continue with the attack.
Fig. 2. The truncated differential path for 6 rounds of Lane-256. Black bytes are active, red (gray) bytes correspond to the first inbound
phase, gray (dark gray) bytes to the second inbound phase and blue (light gray) bytes are used to find collisions in the P-lanes (colors
in brackets correspond to grayscale printing).
3.3 The Inbound Phase

In the rebound attack on Lane, we first apply the inbound phase for a number of 
times. Therefore, we will explain this phase and the corresponding probabilities 
in detail here. In the inbound phase, we search for differences and values conforming
to the truncated differential path for Lane-256 or Lane-512 shown in Fig. 1
with active bytes marked by black bytes. We only describe the application of one
inbound phase here. In the example of Fig. 1, we have 16 active S-boxes between
state #4 and state #5. It follows from the MDS property of MixColumns that
this path has at least one active byte in each of the 4 corresponding columns
prior to the first, and after the second MixColumns transformation (state #2 and
state #7). Note that the active bytes in state #2 and state #7 can also be at
any position marked by gray bytes.

In the inbound phase, we first choose random differences for the 4 active 
bytes after the second MixColumns transformation (state #7). These differences 
are linearly propagated backward to 16 active bytes at the output of the previous 
SubBytes layer (state #5). Next, we take random differences for the 4 active bytes 
prior to the first MixColumns transformation (state #2) and linearly propagate
forward to 16 active bytes at the input of SubBytes (state #4). Then, we need 
to find a match for the input and output differences of all 16 active S-boxes. For 
a single S-box, the probability that a random S-box differential exists is about 
one half, which can be verified easily by computing the differential distribution 
table of the AES S-box (see |0j for more details). 
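
The claim that a random S-box differential exists with probability about one half can be checked with a short script. The following is a sketch under the assumption that the standard AES S-box (field inversion followed by the affine map with constant 0x63) is used; the helper names are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def aes_sbox() -> list:
    """Build the AES S-box: inversion in GF(2^8), then the affine map."""
    box = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):          # x^254 is the inverse of x
                inv = gf_mul(inv, x)
        y = inv
        for s in range(1, 5):             # XOR with the four left rotations
            y ^= ((inv << s) | (inv >> (8 - s))) & 0xFF
        box.append(y ^ 0x63)
    return box


def ddt_row(sbox: list, din: int) -> list:
    """Row of the differential distribution table for input difference din."""
    row = [0] * 256
    for x in range(256):
        row[sbox[x] ^ sbox[x ^ din]] += 1
    return row


SBOX = aes_sbox()
# Number of (input difference, output difference) pairs that are possible.
POSSIBLE = sum(1 for din in range(1, 256) for c in ddt_row(SBOX, din) if c)
FRACTION = POSSIBLE / (255 * 255)

if __name__ == "__main__":
    print(POSSIBLE, FRACTION)
```

Each row for a non-zero input difference contains only the entries 0, 2 and 4, which matches the statement in the text that a matching S-box yields at least two, and in some cases four, byte values.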

For each matching S-box, we get at least two (in some cases 4) possible byte
values such that the S-box differential holds. Hence, we get at least $2^{16}$ possible
values for one full AES state, such that the differential path for the chosen
differences in state #2 and state #7 holds. In other words, after trying $2^{16}$
non-zero differences of state #2 and state #7, we get at least $2^{16}$ solutions for the
truncated differential path between state #2 and state #7. Hence, the average
complexity to find one solution for the inbound phase (differences and values) is
about 1. Note that this holds for both Lane-256 and Lane-512.

3.4 The Outbound Phase 

After we have found differences and values for each inbound phase of the active 
lanes, we need to connect these results and propagate them outwards in the 
outbound phase. In backward direction, we need to match the message expansion 
at the input of each lane. In forward direction, we need to match the differences 
of two P-lanes on each side to get a collision. We describe the conditions for 
these two parts according to our truncated differential path in the following. 
The Message Expansion. After the inbound phases, we get values and differences
at the input and output of the 4 active lanes $P_0$, $P_2$, $P_4$ and $P_5$. Since we
have zero differences in $(h_0, h_1)$ and $(b_0, b_1)$, we get using the message expansion
for lane $P_1$ (see Equation (1)):

$$\Delta b_0 = 0 = \Delta m_0 \oplus \Delta m_2 \oplus \Delta m_3, \qquad \Delta b_1 = 0 = \Delta m_1 \oplus \Delta m_2$$

Hence, we get the following relation for the message differences in $m_0$, $m_1$, $m_2$,
and $m_3$:

$$\Delta m_1 = \Delta m_2 = \Delta m_0 \oplus \Delta m_3 \tag{2}$$

Using (2), we get for the differences in the expanded message words $(a_0, a_1)$ and
$(c_0, c_1)$:

$$\Delta a_0 = \Delta m_1, \quad \Delta a_1 = \Delta m_3, \quad \Delta c_0 = \Delta m_0, \quad \Delta c_1 = \Delta m_2 \tag{3}$$

and thus, the following relations between $a_0$, $a_1$, $c_0$, and $c_1$:

$$\Delta a_0 = \Delta c_1 = \Delta a_1 \oplus \Delta c_0 \tag{4}$$

Besides the differences, we also need to match the values of the message expansion.
Since we aim for a semi-free-start collision, we can freely choose the chaining
value $(h_0, h_1)$ such that the conditions on $(a_0, a_1)$ are satisfied:

$$h_0 = a_0 \oplus m_0 \oplus m_1 \oplus m_2 \oplus m_3, \qquad h_1 = a_1 \oplus m_0 \oplus m_2$$

That means we have conditions on the input $(c_0, c_1)$ left, which we need to match
with the message words $m_0$, $m_1$, $m_2$ and $m_3$. Since we can vary lanes $P_0, P_2$ and
$P_4, P_5$ independently in the following attacks, we can satisfy these conditions by
merging the results of both sides. Using the equations of the message expansion,
we get for $(c_0, c_1)$ using the values of $(a_0, a_1)$:

$$c_0 = a_0 \oplus a_1 \oplus m_0 \oplus m_2 \oplus m_3, \qquad c_1 = a_0 \oplus m_1 \oplus m_2$$

We can rearrange these equations in order to have all terms corresponding to
$P_0, P_2$ on one side and all terms of $P_4, P_5$ on the other side:

$$m_0 \oplus m_2 \oplus m_3 = c_0 \oplus a_0 \oplus a_1, \qquad m_1 \oplus m_2 = c_1 \oplus a_0 \tag{5}$$

For merging the two sides, we will compute, store and compare the following
values of each list:

$$v_1 = c_0 \oplus a_0 \oplus a_1, \quad v_2 = c_1 \oplus a_0, \quad v_3 = m_0 \oplus m_2 \oplus m_3, \quad v_4 = m_1 \oplus m_2$$
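
The relations (2) to (5) follow from the linearity of the message expansion and can be checked numerically. The sketch below assumes the expansion of Equation (1) with 128-bit words as Python integers; `expand` and `check_relations` are illustrative names.

```python
import secrets


def expand(h0, h1, m0, m1, m2, m3):
    """Message expansion of Equation (1); linear over XOR."""
    return {
        "a0": h0 ^ m0 ^ m1 ^ m2 ^ m3, "a1": h1 ^ m0 ^ m2,
        "b0": h0 ^ h1 ^ m0 ^ m2 ^ m3, "b1": h0 ^ m1 ^ m2,
        "c0": h0 ^ h1 ^ m0 ^ m1 ^ m2, "c1": h0 ^ m0 ^ m3,
    }


def check_relations(trials: int = 100) -> bool:
    for _ in range(trials):
        # Differences: zero in (h0, h1), and Equation (2) for the message words.
        dm0, dm3 = secrets.randbits(128), secrets.randbits(128)
        dm1 = dm2 = dm0 ^ dm3
        d = expand(0, 0, dm0, dm1, dm2, dm3)   # linearity: same map on differences
        if d["b0"] != 0 or d["b1"] != 0:
            return False
        if (d["a0"], d["a1"], d["c0"], d["c1"]) != (dm1, dm3, dm0, dm2):  # (3)
            return False
        if not (d["a0"] == d["c1"] == d["a1"] ^ d["c0"]):                 # (4)
            return False
        # Values: Equation (5) holds for any chaining value and message block.
        h0, h1, m0, m1, m2, m3 = (secrets.randbits(128) for _ in range(6))
        e = expand(h0, h1, m0, m1, m2, m3)
        v1, v2 = e["c0"] ^ e["a0"] ^ e["a1"], e["c1"] ^ e["a0"]
        v3, v4 = m0 ^ m2 ^ m3, m1 ^ m2
        if (v1, v2) != (v3, v4):
            return False
    return True


if __name__ == "__main__":
    print(check_relations())
```

In the attack the two sides are generated independently, so $v_1 = v_3$ and $v_2 = v_4$ become matching conditions between the lists rather than identities.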

Colliding P-Lanes. In the forward direction, we need to find a collision for the
differences in $P_0$ and $P_2$, such that $\Delta P_0 \oplus \Delta P_2 = 0$, and for the differences in $P_4$
and $P_5$, such that $\Delta P_4 \oplus \Delta P_5 = 0$. Note that we can swap the order of the last
MixColumns with the XOR operation of the P-lanes since both transformations
are linear. Hence, we only need to match the differences after the last SubBytes
layer in each of the two active lanes. The blue bytes in Fig. 2 of Lane-256, or
the red, blue and yellow bytes in Fig. 0 of Lane-512 are independent of the
inbound phase. Hence, we can use the freedom in these bytes to find a collision
after the P-lanes.
4 Semi-Free-Start Collision for Lane-256

In the rebound attack on Lane-256, we construct a semi-free-start collision for
the full compression function using $2^{96}$ compression function evaluations and
memory requirements of $2^{88}$. We will use the 6-round truncated differential path
given in Fig. 2, which is very similar to the one shown in the Lane specification
[Fig. 4.2, page 33]. We search for a collision after the P-lanes of Lane and use
the same truncated differential path in the 4 active lanes $P_0$, $P_2$, $P_4$ and $P_5$. Since
we do not consider differences in $h_0$ and $h_1$, but we fix their values, the result
will be a semi-free-start collision. The attack on Lane-256 consists basically of
the following parts:

1. First Inbound Phase: Apply the inbound phase at the beginning of the
truncated differential path (state #2 to state #7) for each lane $P_0$, $P_2$, $P_4$,
$P_5$ independently.

2. Second Inbound Phase: Apply the inbound phase in the middle of each
lane again (state #10 to state #15).

3. Merge Inbound Phases: Merge the results of the two inbound phases
(state #7 to state #10).

4. Merge Lanes: Merge the two neighboring lanes $P_0, P_2$ and $P_4, P_5$ and satisfy
the corresponding differences of the message expansion.

5. Message Expansion: Merge the two sides ($P_0$, $P_2$) and ($P_4$, $P_5$) and satisfy
the remaining conditions on the message expansion (differences and values).

6. Find Collisions: Choose remaining free values (neutral bytes) to find a
collision for each side ($P_0$, $P_2$) and ($P_4$, $P_5$) independently.

7. Message Expansion: Merge the two sides ($P_0$, $P_2$) and ($P_4$, $P_5$) and satisfy
the conditions on the message expansion of the remaining bytes.

4.1 First Inbound Phase 

We start the attack on Lane-256 by applying the first inbound phase to each
of the 4 active lanes $P_0$, $P_2$, $P_4$, $P_5$ independently. In each lane, we start with 5
active bytes in state #2 and 8 active bytes in state #7 and choose $2^{96}$ random
non-zero differences for these 13 bytes (note that we could choose up to $2^{104}$
differences). We propagate backward and forward to 16 active bytes at the input
(state #4) and output (state #5) of the SubBytes layer in between. We get at
least $2^{96}$ solutions for the inbound phase with a complexity of $2^{96}$ (see Sect. 3.3).
For each result, only the red and black bytes in Fig. 2 are determined, i.e. the
differences as well as the actual values of the bytes are found. Note that we
have chosen the position of active bytes in state #0, such that at least one term
of Equation (2) or (4) is zero for each byte. At this point, we can compute
backwards to state #0 and independently verify the condition on one byte of
the input differences:

$$
\begin{aligned}
P_0 &: \Delta a_0[0,0] = \Delta a_1[0,0], & P_4 &: \Delta m_0[2,3] = \Delta m_1[2,3], \\
P_2 &: \Delta c_0[2,3] = \Delta c_1[2,3], & P_5 &: \Delta m_2[0,0] = \Delta m_3[0,0].
\end{aligned}
$$

The condition on each of these bytes is fulfilled with a probability of $2^{-8}$ and we
store the $2^{88}$ valid results of each lane $P_0$, $P_2$, $P_4$ and $P_5$ in the corresponding
lists $L_0$, $L_2$, $L_4$ and $L_5$. Note that we store the values and differences of state
#10 (red and black bytes) in these lists, since we need to merge these bytes with
the second inbound phase in the following. For an efficient merging step, the
lists are stored in hash tables (or sorted) according to the bytes to be merged
(differences and values of active bytes in state #10).

4.2 Second Inbound Phase 

Next, we apply the inbound phase again to match the differences at SubBytes
between state #12 and state #13. We start with $2^{64}$ differences in the 8 active
bytes of state #10 and $2^{32}$ differences in the 4 active bytes of state #15. Hence,
we get about $2^{96}$ solutions for the second inbound phase with a complexity of
$2^{96}$. For each result, the gray and black values in Fig. 2 between state #7 and
state #18 are determined. Again, this means we fix the actual values of these
bytes. The results of the second inbound phase for each lane are stored in lists
$L'_0$, $L'_2$, $L'_4$ and $L'_5$. Each entry of these lists holds the values and differences of state
#10 (gray and black bytes). Again, the lists are stored in hash tables (or sorted)
according to the bytes (black bytes) to be merged.

4.3 Merge Inbound Phases 

The two previous inbound phases overlap in 8 active bytes (state #7 to state
#10). We connect the two inbound phases by checking the conditions on the
overlapping bytes of state #10. Since both values and differences need to match,
we get a condition on 128 bits. We merge the $2^{88}$ results of the first inbound
phase and $2^{96}$ results of the second inbound phase to get $2^{88} \times 2^{96} \times 2^{-128} = 2^{56}$
differential paths for each lane. A pair connecting both inbound phases is found
trivially. For each entry of the first list (for example $L_0$), we check the overlapping
bytes against the values of the second list ($L'_0$). Since the second list is a hash
table, the effort for producing all $2^{56}$ valid pairs is $2^{88}$ hash table lookups.
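
The list merging used here and in the later steps amounts to a hash join. The following is a minimal sketch at toy scale; `merge_lists` and the key functions are hypothetical names, and in the attack the key would be the values and differences of the overlapping bytes.

```python
from collections import defaultdict


def merge_lists(first, second, key_first, key_second):
    """Return all pairs (x, y) with key_first(x) == key_second(y).

    The second list is indexed once in a hash table, so the cost is about
    len(first) lookups plus the number of matching pairs.
    """
    table = defaultdict(list)
    for y in second:
        table[key_second(y)].append(y)
    return [(x, y) for x in first for y in table.get(key_first(x), ())]


if __name__ == "__main__":
    # Toy entries: (overlapping bytes, remaining data of the partial solution).
    first = [(1, "a"), (2, "b"), (3, "c")]
    second = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    print(merge_lists(first, second, lambda e: e[0], lambda e: e[0]))
```

With lists of size $2^{s_1}$ and $2^{s_2}$ and a $t$-bit matching condition, about $2^{s_1 + s_2 - t}$ pairs are expected, which is the count used throughout this section.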

Note that for each pair which satisfies and connects both inbound phases,
the differences and values between state #0 and state #18 (black, red and gray
bytes) are determined. We compute and store the $2^{56}$ input values and differences
of state #0 in lists $L_0$, $L_2$, $L_4$ and $L_5$. Although we still do not know half of the
state, each of these input pairs conforms to the whole truncated differential path
from state #0 to state #24 with a probability of 1. In other words, we know
that in state #24, there are at most the given bytes active.

4.4 Merge Lanes 

Next, we continue with merging the solutions of each lane by considering the
message expansion. We first combine the inputs of lane $P_0$ and $P_2$ by merging
lists $L_0$ and $L_2$. When merging these lists, we need to satisfy the conditions on
the differences of the message expansion. We have conditions on 5 active bytes
of state #0 in lane $P_0$ and $P_2$ (see Fig. 2). Remember that we have chosen the
position of these active bytes, such that at least one term of Equation (2) or (4)
is zero. Hence, we only need to check if two corresponding byte differences are
equal. Since we have already verified one byte difference (see Sect. 4.1), we have
4 byte conditions left:

$$\Delta a_0[0,0] = \Delta c_1[0,0], \qquad \Delta a_1[0,1] = \Delta c_0[0,1] \tag{6}$$

$$\Delta a_1[1,1] = \Delta c_0[1,1], \qquad \Delta a_0[2,3] = \Delta c_0[2,3] \tag{7}$$

These conditions are fulfilled with a probability of $2^{-32}$ and by merging two lists
($L_0$ and $L_2$) of size $2^{56}$, we get $2^{56} \times 2^{56} \times 2^{-32} = 2^{80}$ valid matches which we
store in list $L_{02}$. We repeat the same for lane $P_4$ and $P_5$ by merging lists $L_4$ and
$L_5$. We get $2^{80}$ matches for list $L_{45}$ as well, since we need to fulfill the 32-bit
conditions on the differences of the following 4 bytes:

$$\Delta m_1[0,0] = \Delta m_2[0,0], \qquad \Delta m_0[0,1] = \Delta m_3[0,1] \tag{8}$$

$$\Delta m_0[1,1] = \Delta m_3[1,1], \qquad \Delta m_0[2,3] = \Delta m_2[2,3] \tag{9}$$

Again, if we use hash tables or the previous lists are sorted according to the
bytes to match, the merge operation can be performed very efficiently. Hence,
the total complexity to produce the lists $L_{02}$ and $L_{45}$ is determined by their final
size and requires an effort of around $2^{80}$ computations.

4.5 Message Expansion 

For all entries of the lists $L_{02}$ and $L_{45}$, the values in 32 bytes and differences in
10 bytes of each of $(a_0, a_1, c_0, c_1)$ and $(m_0, m_1, m_2, m_3)$ have been fixed (red and
black bytes in state #0 of Fig. 2). Note that the conditions on the differences of
each side on its own have already been fulfilled ($P_0 \leftrightarrow P_2$ and $P_4 \leftrightarrow P_5$). Hence,
if we just fulfill the conditions on the remaining differences between $P_0 \leftrightarrow P_4$,
then the conditions on $P_2 \leftrightarrow P_5$ are satisfied as well. Using Equations (2)-(4),
the position of active bytes in Fig. 2 and the already matched differences of
Sect. 4.1 and Sect. 4.4, we only have the following 4 byte conditions left:

$$\Delta a_0[0,0] = \Delta m_1[0,0], \qquad \Delta a_1[0,1] = \Delta m_0[0,1]$$

$$\Delta a_1[1,1] = \Delta m_0[1,1], \qquad \Delta a_0[2,3] = \Delta m_0[2,3]$$

Note that we also need to fulfill the conditions on the values of the states.
Remember that we can freely choose the chaining values $(h_0, h_1)$ to satisfy the
values in the first 16 bytes of the message expansion $(a_0, a_1)$. To fulfill the
conditions on the 16 bytes of $(c_0, c_1)$ we need to satisfy Equation (5) using the
corresponding values $v_1$, $v_2$, $v_3$ and $v_4$. Hence, we need to find a match for the
following values and differences by merging lists $L_{02}$ and $L_{45}$:

– 8 bytes of $v_1$ from $L_{02}$ with $v_3$ from $L_{45}$,

– 8 bytes of $v_2$ from $L_{02}$ with $v_4$ from $L_{45}$,

– 4 bytes of differences in $L_{02}$ and in $L_{45}$.

Since we have $2^{80}$ elements in each list and conditions on 160 bits, we expect to
find $2^{80} \times 2^{80} \times 2^{-160} = 1$ result. This result satisfies the message expansion for
all lanes and is a solution for the truncated differential path of each active lane
between state #0 and state #24. However, we do not get a collision at the end
of the P-lanes yet, since we do not know the differences of state #24.

4.6 Find Collisions 

In this phase of the attack, we search for a collision at the end of the P-lanes
($P_0$, $P_2$) and ($P_4$, $P_5$) using the remaining freedom in the second half of the state.
Note that the 16-byte difference in state #24 is obtained from the 8-byte difference
in state #22 with the linear transforms MixColumns and SwapColumns. Hence,
the collision space (the 16 bytes where the two lanes differ) has only $2^{64}$ distinct
elements. If we take a look at Fig. 2, we get for the values in state #7:

– The black, red and gray bytes represent values which have already been
determined by the previous parts of the attack.

– The blue bytes represent values not yet determined and can be used to vary
the differences in state #22.

To find a collision between two lanes, we can still choose $2^{64}$ values for the blue
bytes in state #7 of each lane and store these results in lists $L_0$, $L_2$, $L_4$ and
$L_5$. Note that for these $2^{64}$ values, we get only $2^{32}$ different values for the two
free bytes in the first and fifth column of state #18. Hence, we can only iterate
through $2^{32}$ differences in state #22 for each lane. However, this is enough to
find one colliding difference for each side, since $2^{32} \times 2^{32} \times 2^{-64} = 1$. By repeating
this step $2^{32}$ times for each side, we expect $2^{64} \times 2^{64} \times 2^{-64} = 2^{64}$ results for
each merged list $L_{02}$ and $L_{45}$.

4.7 Message Expansion 

Finally, we need to match the message expansion for the remaining 32 bytes
of each side. Hence, we just repeat the same procedure as we did for the first
half of state #0, except that we only need to match the values of 32 bytes but
no differences. Again, we can use the remaining bytes of $(h_0, h_1)$ to fulfill the
conditions on 16 bytes of $(a_0, a_1)$. Since we have $2^{64}$ solutions in each list $L_{02}$
and $L_{45}$, we expect to find $2^{64} \times 2^{64} \times 2^{-128} = 1$ colliding pair for $(c_0, c_1)$ and
thus, a collision for the full compression function of Lane-256.

4.8 Complexity 

Let us determine the complexity of the whole attack. The first inbound phase requires
$2^{96}$ computations and $2^{88}$ memory, the second inbound phase requires $2^{96}$ computations
and $2^{96}$ memory, and the merging of the inbound phases requires $2^{88}$ hash table
lookups and $2^{56}$ memory. The second inbound phase and the merge
inbound phases step can be combined to lower the memory requirement of these three
steps. Namely, we create the lists $L_0$, $L_2$, $L_4$ and $L_5$ in the first inbound phase.
Then, for each differential path of the second inbound phase, instead of storing
it in a list, we immediately check whether it can be merged with some differential from
the lists. Only if it can be merged do we perform the outbound phase and compute state
#0. Hence, the first three steps of our attack require around $2^{96}$ computations
and $2^{88}$ memory. The merge lanes step requires $2^{80}$ computations and memory.
The message expansion steps require $2^{80}$ computations, while the find collisions
step requires $2^{32}$ computations. Hence, the total attack complexity is around
$2^{96}$ computations and $2^{88}$ memory. Note that the cost of each computation is
never greater than the cost of one compression function evaluation. Therefore,
the complexity to find a semi-free-start collision for all 6 rounds of Lane-256 is
about $2^{96}$ compression function evaluations and $2^{88}$ memory.
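
To see why the total is "around $2^{96}$", the step costs quoted above can simply be added up. The following Python sketch does so; the dictionary keys are our own labels for the steps, and the exponents are the ones stated in the text.

```python
import math

# log2 of the computational cost of each group of steps, as stated above.
step_costs_log2 = {
    "inbound phases and their merge": 96,
    "merge lanes": 80,
    "message expansion": 80,
    "find collisions": 32,
}

# Python integers are arbitrary precision, so the sum is exact.
total = sum(2 ** exponent for exponent in step_costs_log2.values())
total_log2 = math.log2(total)

print(f"total cost is about 2^{total_log2:.6f}")
```

The smaller terms change the exponent only far beyond the decimal places that matter, so the most expensive step determines the overall complexity.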

5 Semi-Free-Start Collision for Lane-512 

In the rebound attack on Lane-512, we construct a semi-free-start collision for
the full, 8-round compression function using $2^{224}$ compression function evaluations
and memory requirements of $2^{128}$. We use the same iterative truncated
differential path as shown in the specification of Lane-512 [Fig. 4.3, page 34],
which is given in Fig. 3. Similar to the attack on Lane-256, we search for a
collision after the P-lanes and use the same truncated differential path in the 4
active lanes $P_0$, $P_2$, $P_4$ and $P_5$. The attack on Lane-512 consists of the
following parts:

1. First Inbound Phase: Apply the inbound phase at the beginning of the
truncated differential path (state #2 to state #7) for each lane $P_0$, $P_2$, $P_4$,
$P_5$ independently.

2. Merge Lanes: Merge the two neighboring lanes $(P_0, P_2)$ and $(P_4, P_5)$, and satisfy
the corresponding differences of the message expansion.

3. Message Expansion: Merge the two sides $(P_0, P_2)$ and $(P_4, P_5)$ and satisfy
the remaining conditions on the message expansion (differences and values).

4. Second Inbound Phase: Apply the inbound phase in the middle of each
lane again (state #10 to state #15).

5. Merge Inbound Phases: Merge the results of the two inbound phases.

6. Starting Points: Choose random values for the brown bytes in state #7 to
get enough starting points for the subsequent phases.

7. Merge Lanes: Merge the values of the starting points for the two neighboring
lanes $(P_0, P_2)$ and $(P_4, P_5)$ and satisfy the corresponding differences of the
message expansion.

8. Message Expansion: Merge the two sides $(P_0, P_2)$ and $(P_4, P_5)$ and satisfy
the remaining conditions on the message expansion (differences and values)
for the starting points.


9. Third Inbound Phase: Apply the inbound phase at the end of each lane 
for a third time (state #18 to state #23). 

10. Merge Inbound Phases: Merge the results of the three inbound phases 
and use the remaining freedom in between. 

11. Find Collisions: Merge the corresponding two lanes to find a collision for
each side $(P_0, P_2)$ and $(P_4, P_5)$ independently.

12. Message Expansion: Merge the two sides $(P_0, P_2)$ and $(P_4, P_5)$ and satisfy
the conditions on the message expansion of the remaining bytes.

5.1 First Inbound Phase 


We start the attack on Lane-512 by applying the first inbound phase to each
of the 4 active lanes $P_0$, $P_2$, $P_4$, $P_5$ independently. In each lane, we start with 8
active bytes in state #2 and 4 active bytes in state #7 and choose $2^{84}$ random
non-zero differences for these 12 bytes (note that we could choose up to $2^{96}$
differences). We propagate backward and forward to 16 active bytes at the input
(state #4) and output (state #5) of the SubBytes layer in between. We get at
least $2^{84}$ matches for the inbound phase with a complexity of $2^{84}$ (see Sect. 3.3).
For each result, the gray and black bytes in Fig. 3 are determined. Hence, we
can already verify the conditions on two bytes of the input differences for each
lane by computing backwards to state #0:

$P_0$: $\Delta a_0[2,2] = \Delta a_1[2,2]$, $\Delta a_0[2,6] = \Delta a_1[2,6]$
$P_2$: $\Delta c_0[1,1] = \Delta c_1[1,1]$, $\Delta c_0[1,5] = \Delta c_1[1,5]$
$P_4$: $\Delta m_0[1,1] = \Delta m_1[1,1]$, $\Delta m_0[1,5] = \Delta m_1[1,5]$
$P_5$: $\Delta m_2[2,2] = \Delta m_3[2,2]$, $\Delta m_2[2,6] = \Delta m_3[2,6]$

The conditions on each of the lanes are fulfilled with a probability of $2^{-16}$, and we
store the $2^{84} \times 2^{-16} = 2^{68}$ valid matches of each lane $P_0$, $P_2$, $P_4$ and $P_5$ in the
corresponding lists $L_0$, $L_2$, $L_4$ and $L_5$.


5.2 Merge Lanes 

Next, we continue with merging the solutions of each lane by considering the
message expansion. We first combine the results of lanes $P_0$ and $P_2$ by merging
lists $L_0$ and $L_2$. When merging these lists, we need to satisfy the conditions on
the differences of the message expansion for the following 6 bytes:

$\Delta a_1[0,0] = \Delta c_0[0,0]$, $\Delta a_1[0,4] = \Delta c_0[0,4]$

$\Delta a_0[1,1] = \Delta c_0[1,1]$, $\Delta a_0[1,5] = \Delta c_0[1,5]$

$\Delta a_0[2,2] = \Delta c_1[2,2]$, $\Delta a_0[2,6] = \Delta c_1[2,6]$

Since this match is fulfilled with a probability of $2^{-48}$ and we merge two lists
of size $2^{68}$, we get $2^{68} \times 2^{68} \times 2^{-48} = 2^{88}$ valid matches, which we store in $L_{02}$.

Fig. 3. The truncated differential path for 8 rounds of Lane-512. Lane $P_0$ shows the
plain truncated differential path, lane $P_2$ shows other possible truncated differential paths,
and lanes $P_4$ and $P_5$ are used to describe the attack.

We repeat the same for lanes $P_4$ and $P_5$ and merge lists $L_4$ and $L_5$. We get $2^{88}$ matches
for list $L_{45}$, since we need to fulfill conditions on the differences of 6 bytes as well:

$\Delta m_0[0,0] = \Delta m_3[0,0]$, $\Delta m_0[0,4] = \Delta m_3[0,4]$

$\Delta m_0[1,1] = \Delta m_2[1,1]$, $\Delta m_0[1,5] = \Delta m_2[1,5]$

$\Delta m_1[2,2] = \Delta m_2[2,2]$, $\Delta m_1[2,6] = \Delta m_2[2,6]$
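
Merging two lists under equality conditions is typically done with a hash table, so the work is proportional to the list sizes plus the number of matches rather than to their product. The Python sketch below illustrates this on toy data. It is a generic join under our own assumptions (the function `merge_lists` and the tuple layout are hypothetical), not the authors' implementation.

```python
from collections import defaultdict


def merge_lists(list_a, list_b, key_a, key_b):
    """Return all pairs (a, b) with key_a(a) == key_b(b).

    key_a and key_b extract the bytes on which both sides must agree,
    e.g. a tuple of the six difference bytes that have to match.
    """
    index = defaultdict(list)
    for a in list_a:                      # one pass to build the hash table
        index[key_a(a)].append(a)
    merged = []
    for b in list_b:                      # one lookup per entry of list_b
        for a in index.get(key_b(b), []):
            merged.append((a, b))
    return merged


# Toy entries: (difference byte that must match, label of the lane solution).
toy_a = [(1, "a0"), (2, "a1"), (1, "a2")]
toy_b = [(1, "b0"), (3, "b1")]
pairs = merge_lists(toy_a, toy_b, key_a=lambda e: e[0], key_b=lambda e: e[0])
print(pairs)
```

In the attack, the key would cover all six difference bytes at once, so a random pair survives with probability $2^{-48}$.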

5.3 Message Expansion 

For all entries of lists $L_{02}$ and $L_{45}$, the values in 32 bytes and differences in 16
bytes of each of $(a_0, a_1, c_0, c_1)$ and $(m_0, m_1, m_2, m_3)$ have been fixed (gray and
black bytes in state #0 of Fig. 3). Since the conditions on the differences of each
side on its own have already been fulfilled, we just need to match the conditions
on the remaining 6-byte differences between the two sides $(P_0, P_2)$ and $(P_4, P_5)$:

$\Delta a_1[0,0] = \Delta m_0[0,0]$, $\Delta a_1[0,4] = \Delta m_0[0,4]$

$\Delta a_0[1,1] = \Delta m_0[1,1]$, $\Delta a_0[1,5] = \Delta m_0[1,5]$

$\Delta a_0[2,2] = \Delta m_1[2,2]$, $\Delta a_0[2,6] = \Delta m_1[2,6]$

Remember that we can freely choose the chaining values $(h_0, h_1)$ to satisfy the
values in the first 16 bytes of the message expansion $(a_0, a_1)$. To fulfill the conditions
on the 16 bytes of $(c_0, c_1)$, we need to find matches for the following values
and differences using lists $L_{02}$ and $L_{45}$:

— 8 bytes of v 4 from L 02 with v- 4 from P45, 

— 8 bytes of V 2 from L 02 with v 4 from P45, 

— 6 bytes of differences in L 02 and in P45. 

Since we have $2^{88}$ elements in each list and conditions on 176 bits, we expect to
find $2^{88} \times 2^{88} \times 2^{-176} = 1$ result. This result satisfies the message expansion for
all lanes and is a solution for the truncated differential path of each active lane
between state #0 and state #10.

5.4 Second Inbound Phase 

Next, we apply the inbound phase again to match the differences at SubBytes 
between state #12 and state #13. After the first inbound phase, the values of 
16 bytes in state #10 (black and gray bytes), and the difference in 16 bytes (1st 
AES-block) of state #12 (black bytes) have already been fixed. Hence, we can
start with $2^{32}$ possible 4-byte differences in state #15, compute backwards to
state #13 and match the differences in the SubBytes layer. We expect
to find at least $2^{32}$ solutions for the second inbound phase (see Sect. 3.3).

5.5 Merge Inbound Phases 

The results of the second inbound phase are $2^{32}$ values for the 16 bytes in state
#10 (green and black bytes). From the first inbound phase, we have obtained
one solution for 16 bytes in state #10 (gray and black bytes) as well. In these
16 bytes, the values of the 4 active bytes (black) overlap between both inbound
phases, and the probability for a successful match is $2^{-32}$. Among the $2^{32}$ results
of the second inbound phase, we expect to find one solution to match the values
of state #10. Once we have found a match, we can compute the values of the
newly determined 12 bytes in state #7, marked by green bytes in Fig. 3.

5.6 Starting Points 

In this phase of the attack, we will compute a number of starting points which 
we will need for the subsequent steps. For each lane, we choose random values
for the 12 bytes in state #7 (marked by brown bytes in Fig. 3) and compute
the corresponding 16-byte values in state #0. We repeat this step $2^{64}$ times and
store the results in the corresponding lists $L'_0$, $L'_2$, $L'_4$ or $L'_5$.

5.7 Merge Lanes 

Next, we merge lists $L'_0$ and $L'_2$ to get the list $L'_{02}$, consisting of $2^{128}$ values for
the 32 newly determined bytes of $(a_0, a_1, c_0, c_1)$ (brown bytes of state #0 in
lanes $P_0$ and $P_2$). Further, we merge lists $L'_4$ and $L'_5$ to get the list $L'_{45}$ of size
$2^{128}$ containing the 32-byte values of $(m_0, m_1, m_2, m_3)$.

5.8 Message Expansion 

Finally, we satisfy the conditions of the message expansion on $(a_0, a_1)$ using the
values of $(h_0, h_1)$, and use the two lists $L'_{02}$ and $L'_{45}$ to satisfy the conditions on
$(c_0, c_1)$. Since we need to match 16 bytes of $(c_0, c_1)$ and have $2^{128}$ elements in
both lists, we expect $2^{128} \times 2^{128} \times 2^{-128} = 2^{128}$ matching pairs, which we store
in list $L_s$. We will use these values in a later phase of the attack.

5.9 Third Inbound Phase 

Now, we extend the truncated differential path by applying a third inbound 
phase between state #18 and state #23 for each active lane. Note that the 
values in 16 bytes of state #18 (black and green bytes), and the differences in 16 
bytes (1st AES-block) of state #20 (black bytes) have already been fixed due to 
the second inbound phase. Similar to the second inbound phase, we start with
$2^{32}$ 4-byte differences in state #23 and compute backwards to state #21 to get
a match for the SubBytes layer. Since we have $2^{32}$ starting differences, we expect
to find $2^{32}$ results for the third inbound phase, with fixed values and differences
for the 16 bytes in state #15 (purple and black bytes).

5.10 Merge Inbound Phases 

The values of the second and the third inbound phase overlap in 4 active bytes
(black) of state #18. Since we have $2^{32}$ results of the third inbound phase, we
expect to find one solution after merging the two phases. Once we have found
a match, we can compute the values of the newly determined 12 bytes in state
#15, marked by purple bytes in Fig. 3. Next, we need to connect all three
inbound phases. For all possible 8-byte values of state #10 marked by red bytes,
we compute the 16 corresponding bytes in state #15 (2nd AES-block). If the
computed values satisfy the 4 bytes in state #15 marked by purple, we store
the result of each lane in the corresponding lists $L^a_0$, $L^a_2$, $L^a_4$ and $L^a_5$. In total,
we obtain $2^{64} \cdot 2^{-32} = 2^{32}$ entries in each list. We repeat the same for the bytes
marked by blue and yellow, and generate the lists $L^b_i$ and $L^c_i$ for each of the
active lanes with index $i \in \{0, 2, 4, 5\}$. For each lane, we merge the three lists
$L^a_i$, $L^b_i$ and $L^c_i$ and store the $2^{96}$ results in lists $L^*_i$. Note that for each entry in
these lists, we can determine all values and differences of the corresponding lane.

5.11 Find Collisions 

In this phase of the attack, we finally search for a collision at the end of the
P-lanes $(P_0, P_2)$ and $(P_4, P_5)$ using the elements of lists $L^*_i$. To find a collision at
the end of the P-lanes, we need to match the 16-byte differences in state #32 of
the two corresponding active lanes such that $\Delta(P_0 \oplus P_2) = 0$ and $\Delta(P_4 \oplus P_5) = 0$.
Note that we can satisfy these conditions independently for each side $(P_0, P_2)$
and $(P_4, P_5)$. Since we need to match 128 bits and we have $2^{96}$ elements in each
list $L^*_i$, we expect to find $2^{96} \cdot 2^{96} \cdot 2^{-128} = 2^{64}$ collisions for each side. We store
the corresponding inputs $(a_0, a_1, c_0, c_1)$ for the collisions between lanes $P_0$ and
$P_2$ in list $L^*_{02}$ and the inputs $(m_0, m_1, m_2, m_3)$ for the collisions between lanes $P_4$
and $P_5$ in list $L^*_{45}$.

5.12 Message Expansion 

Finally, we need to match the message expansion for the remaining 32 bytes
of each side. Hence, we just repeat the same procedure as we did for the first
part of state #0, except that we only need to match the values of 32 bytes
but no differences. Again, we use the values of $(h_0, h_1)$ to satisfy the conditions
on $(a_0, a_1)$ first. Then, we match the values of the 32 bytes in $(c_0, c_1)$. Since
we only have $2^{64}$ entries in both of $L^*_{02}$ and $L^*_{45}$, the success probability for a
match is $2^{64} \cdot 2^{64} \cdot 2^{-256} = 2^{-128}$. However, we can still repeat from Sect. 5.6
using a different starting point stored in list $L_s$. Since we have $2^{128}$ elements in
list $L_s$, we can repeat the previous steps up to $2^{128}$ times. Hence, we expect to
find one valid match for the message expansion and thus a collision for the full
compression function of Lane-512.

5.13 Complexity 

The total complexity of the rebound attack on Lane-512 is determined by
the merging step after the third inbound phase. This step has a complexity
of $2^{96}$ compression function evaluations and is repeated $2^{128}$ times. The memory
requirements are determined by the largest lists, which are $L'_{02}$ and $L'_{45}$ (or $L_s$)
with a size of $2^{128}$. Hence, the total complexity to find a semi-free-start collision
for Lane-512 is about $2^{128} \cdot 2^{96} = 2^{224}$ compression function evaluations and
$2^{128}$ in memory.
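
The final figure combines two quantities from Sect. 5.12 and this section: the success probability of a single attempt and the cost of one attempt. A short Python sketch of that calculation, in log2 units, is given below; the function name `attack_cost_log2` is our own.

```python
def attack_cost_log2(per_attempt_log2: int, success_log2: int) -> int:
    """log2 of the expected total work when one attempt costs
    2^per_attempt_log2 and succeeds with probability 2^success_log2."""
    expected_attempts_log2 = -success_log2
    return per_attempt_log2 + expected_attempts_log2


# Sect. 5.12: 2^64 * 2^64 candidate pairs against a 256-bit condition.
match_probability_log2 = 64 + 64 - 256

# The list L_s holds 2^128 starting points, enough for the needed repetitions.
starting_points_log2 = 128

total_log2 = attack_cost_log2(96, match_probability_log2)
print(total_log2)
```

The number of required repetitions does not exceed the number of available starting points, which is why one match is expected.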

6 Conclusion 

In this work, we have applied the rebound attack to the hash function Lane. 
In the attack we use a truncated differential path with differences concentrating 
mostly in one part of the lanes. Due to the relatively slow diffusion of parallel 
AES rounds, we are therefore able to solve parts of the lanes independently. 
First, we search for differences and values (for parts of the state) according to 
the truncated differential path and also satisfy the message expansion. Then, we 
choose values which can be changed such that the truncated differential path and
the corresponding message expansion still hold. The freedom in these values is then used
to search for a collision at the end of the lanes without violating the differential 
path or message expansion. 

In the rebound attack on Lane, we are able to construct semi-free-start collisions
for full-round Lane-224 and Lane-256 with $2^{96}$ compression function
evaluations and memory of $2^{88}$, and for full-round Lane-512 with a complexity of
$2^{224}$ compression function evaluations and memory of $2^{128}$. Although these collisions
on the compression function do not imply an attack on the hash functions,
they violate the reduction proofs of Merkle and Damgård, or Andreeva in the
case of Lane. However, due to the limited degrees of freedom, a collision attack
on the hash function seems to be difficult for full-round Lane.

Abstract. Whirlpool is a hash function based on a block cipher that
can be seen as a scaled-up variant of the AES. The main difference is the
extremely conservative key schedule (compared to that of the AES). In this work,
we present a distinguishing attack on the full compression function of
Whirlpool. We obtain this result by improving the rebound attack on
reduced Whirlpool with two new techniques. First, the inbound phase of
the rebound attack is extended by up to two rounds using the available
degrees of freedom of the key schedule. This results in a near-collision
attack on 9.5 rounds of the compression function of Whirlpool with a
complexity of $2^{176}$ and negligible memory requirements. Second, we show
how to turn this near-collision attack into a distinguishing attack for the
full 10-round compression function of Whirlpool. This is the first result
on the full Whirlpool compression function.

Keywords: hash functions, cryptanalysis, near-collision, distinguisher. 

1 Introduction 

In the last few years the cryptanalysis of hash functions has become an important 
topic within the cryptographic community. Especially the collision attacks on the 
MD4 family of hash functions (MD4, MD5, SHA-1) have weakened the security 
assumptions of these commonly used hash functions |bl7H7l24l25l26| . Still, most 
of the existing cryptanalytic work has been published for this particular family 
of hash functions. Therefore, the analysis of alternative hash functions is of great 
interest. In this article, we will present a security analysis of the Whirlpool hash 
function with respect to collision resistance. 

Whirlpool is the only hash function standardized by ISO/IEC 10118-3:2004 
(since 2000) that does not follow the MD4 design strategy. Furthermore, it has 
been evaluated and approved by NESSIE (201 . Whirlpool is commonly considered 
to be a conservative block-cipher based design with an extremely conservative 
key schedule and follows the wide-trail design strategy (4|Sj . Since its proposal 
in 2000, only a few results have been published. 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 126-143, 2009.

Table 1. Summary of results for Whirlpool. Complexities are given in compression
function evaluations; a memory unit refers to a state (512 bits). The complexities in
brackets refer to modified attacks using a precomputed table taking $2^{128}$ time/memory
to set up.

target                rounds  complexity (runtime/memory)           type            source
block cipher W        6       $2^{120}/2^{120}$                     distinguisher   Knudsen
hash function         4.5     $2^{120}/2^{7}$                       collision       Mendel et al., FSE 2009
hash function         6.5     $2^{128}/2^{7}$                       near-collision  Mendel et al., FSE 2009
compression function  5.5     $2^{120}/2^{7}$                       collision       Mendel et al., FSE 2009
compression function  7.5     $2^{128}/2^{7}$                       near-collision  Mendel et al., FSE 2009
hash function         5.5     $2^{120+s}/2^{64-s}$                  collision       Appendix A
hash function         7.5     $2^{128+s}/2^{64-s}$                  near-collision  Appendix B
compression function  7.5     $2^{184}/2^{8}$ ($2^{120}/2^{128}$)   collision       Sect. 4
compression function  9.5     $2^{176}/2^{8}$ ($2^{112}/2^{128}$)   near-collision  Sect. 4
compression function  10      $2^{188}/2^{8}$ ($2^{121}/2^{128}$)   distinguisher   Sect. 5



Related Work. At FSE 2009, Mendel et al. proposed a new technique for 
the analysis of hash functions: the rebound attack jIS|. It can be applied to both 
block cipher based and permutation based constructions. The idea of the rebound 
attack is to divide an attack into two phases, an inbound and an outbound phase. 
In the inbound phase, degrees of freedom are used such that in the outbound
phase several rounds can be bypassed in both the forward and backward directions.
This led to successful attacks on round-reduced Whirlpool for up to 7.5 (out of
10) rounds. The results are summarized in Table 1.

For the block cipher W that is implicitly used in the Whirlpool compression 
function, Knudsen described an integral distinguisher for 6 out of 10 rounds [TTJ . 
Furthermore, it is assumed that this property may extend also to 7 rounds. Note 
that in m similar techniques were used to obtain known-key distinguishers for 
7-rounds of the AES. 

Our Contribution. The main contribution of this paper is a distinguishing 
attack on the full compression function of Whirlpool which is achieved by im- 
proving upon the work of Mendel et al. in [HD in several ways. 

We start with a description of the hash function Whirlpool. Then, in Sect. 3,
we give an overview of the rebound attack and show how it is applied to reduced
versions of Whirlpool. In Sect. 4, we describe our improvement of the rebound
attack on Whirlpool in detail. This technique enables us to add two rounds in
the inbound phase of the attack and thus gives a collision and near-collision
attack on the Whirlpool compression function reduced to 7.5 and 9.5 rounds,
respectively. Based on this, we describe in Sect. 5 a new generic attack and show
how to distinguish the full (all 10 rounds) compression function of Whirlpool
from a random function by turning the near-collision attack for 9.5 rounds into
a distinguishing attack for 10 rounds. To the best of our knowledge, this is the
first result on the full Whirlpool compression function. Table 1 summarizes the
previous results on Whirlpool as well as the contributions of this paper.

2 Description of Whirlpool

Whirlpool is a cryptographic hash function designed by Barreto and Rijmen in 
2000 p. It is an iterative hash function based on the Merkle-Damgard design 
principle (c/. [T%|h It processes 512-bit message blocks and produces a 512-bit 
hash value. If the message length is not a multiple of 512, an unambiguous 
padding method is applied. For the description of the padding method we refer 
to p. Let $M = M_1 \| M_2 \| \cdots \| M_t$ be a $t$-block message (after padding). The hash
value $h = H(M)$ is computed as follows:

$H_0 = IV$ (1)

$H_j = W(H_{j-1}, M_j) \oplus H_{j-1} \oplus M_j$ for $0 < j \le t$ (2)

$h = H_t$ (3)

where IV is a predefined initial value and W is a 512 bit block cipher used in 
the Miyaguchi-Preneel mode [TB| • The block cipher W used by Whirlpool is very 
similar to the Advanced Encryption Standard (AES) na- 
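
The iteration in (1)-(3) can be sketched in a few lines of Python. The block cipher below is a hypothetical stand-in built from SHA-512 purely so the example runs; it is not Whirlpool's cipher $W$, and the all-zero `iv` default is an assumption made for the example. Padding is omitted.

```python
import hashlib

BLOCK_BYTES = 64  # 512-bit chaining values and message blocks


def toy_w(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher W(key, block). NOT the real W."""
    return hashlib.sha512(key + block).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def miyaguchi_preneel(blocks, iv: bytes = bytes(BLOCK_BYTES)) -> bytes:
    """H_0 = IV; H_j = W(H_{j-1}, M_j) xor H_{j-1} xor M_j; output H_t."""
    h = iv
    for m in blocks:
        if len(m) != BLOCK_BYTES:
            raise ValueError("each message block must be 64 bytes")
        h = xor(xor(toy_w(h, m), h), m)
    return h


message_blocks = [bytes([i]) * BLOCK_BYTES for i in (1, 2, 3)]
digest = miyaguchi_preneel(message_blocks)
print(digest.hex())
```

Note how the chaining value $H_{j-1}$ enters as the cipher key; this is the input the compression function attacks later exploit through the key schedule.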

The state update transformation and the key schedule update an $8 \times 8$ state
$S$ and $K$ of 64 bytes in 10 rounds. In one round, the state is updated by the
round transformation $r_i$ as follows:

$r_i = AK \circ MR \circ SC \circ SB$.

The round transformations are briefly described here: 

— the non-linear layer SubBytes (SB) applies an S-Box to each byte of the state 
independently. 

— the cyclical permutation ShiftColumns (SC) rotates the bytes of column j 
downwards by j positions. 

— the linear diffusion layer MixRows (MR) is a right-multiplication by the $8 \times 8$
circulant MDS matrix cir(1, 1, 4, 1, 8, 5, 2, 9).

— the key addition AddRoundKey (AK) adds the round key $K_i$ to the $8 \times 8$ state,
and AddConstant (AC) adds the round constant $C_i$ to the $8 \times 8$ state of the
key schedule.
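
The two linear layers can be illustrated with a short Python sketch operating on a state stored as 8 rows of 8 bytes. Several details are assumptions made for the example and should be checked against the Whirlpool specification: the reduction polynomial `0x11D`, the convention that row $k$ of the circulant matrix is the first row rotated right by $k$ positions, and "downwards" meaning increasing row index.

```python
CIR = (1, 1, 4, 1, 8, 5, 2, 9)  # first row of the circulant matrix
POLY = 0x11D                    # assumed: x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result


def shift_columns(state):
    """Rotate the bytes of column j downwards by j positions."""
    out = [[0] * 8 for _ in range(8)]
    for r in range(8):
        for c in range(8):
            out[(r + c) % 8][c] = state[r][c]
    return out


def mix_rows(state):
    """Right-multiply every row by the circulant matrix C[k][j] = CIR[(j - k) % 8]."""
    out = []
    for row in state:
        new_row = []
        for j in range(8):
            acc = 0
            for k in range(8):
                acc ^= gf_mul(row[k], CIR[(j - k) % 8])
            new_row.append(acc)
        out.append(new_row)
    return out


# A difference with a single active byte spreads to a full row under MixRows.
delta = [[0] * 8 for _ in range(8)]
delta[3][5] = 0x2A
spread = mix_rows(delta)
print(spread[3])
```

Because every matrix entry is non-zero, one active byte in a row always yields eight active bytes in that row, which is the $1 \to 8$ step of the truncated differential trails used later.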

After the last round of the state update transformation, the initial value or
previous chaining value $H_{j-1}$, the message block $M_j$, and the output value of
the last round are combined (xored), resulting in the output of one iteration. A
detailed description of the hash function is given in P . 

We denote the resulting state of round transformation $r_i$ by $S_i$ and the
intermediate states after SubBytes by $S_i^{SB}$, after ShiftColumns by $S_i^{SC}$ and
after MixRows by $S_i^{MR}$. The initial state prior to the first round is denoted by
$S_0 = M_j \oplus K_0$. The same notation is used for the key schedule with round keys
$K_i$ with $K_0 = H_{j-1}$.

3 The Rebound Attack 

The rebound attack is a new tool for the cryptanalysis of hash functions and 
was published by Mendel et al. in p]j. It is a differential attack. The main 
idea is to use the available degrees of freedom in a collision attack to efficiently
fulfill the low probability parts in the middle of a differential trail. The rebound 
attack consists of an inbound phase with a meet-in-the-middle part in order to 
exploit the available degrees of freedom, and a subsequent probabilistic outbound 
phase. AES based hash functions are a natural target for this attack, since their 
construction principle allows a simple application of the idea. 

3.1 Basic Attack Strategy 

In the rebound attack, the compression function, internal block cipher or permutation
of a hash function is split into three sub-parts. Let $W$ be a block cipher;
then $W = W_{fw} \circ W_{in} \circ W_{bw}$.



Fig. 1. A schematic view of the rebound attack. The attack consists of an inbound and 
two outbound phases. 


The rebound attack can be described by two phases (see Fig. 1):

— Inbound phase: This is a meet-in-the-middle phase in $W_{in}$, which is aided by
the degrees of freedom that are available to a hash function cryptanalyst.
This very efficient combination of meet-in-the-middle techniques with the
exploitation of available degrees of freedom is called the match-in-the-middle
approach.

— Outbound phase: In the second phase, the matches of the inbound phase
are computed in both the forward and backward directions through $W_{fw}$ and
$W_{bw}$ to obtain the desired collisions or near-collisions. If the differential trail
through $W_{fw}$ and $W_{bw}$ has a low probability, one has to repeat the inbound
phase to obtain more starting points for the outbound phase.

3.2 Preliminaries for the Rebound Attack on Whirlpool 

In the following, we want to briefly summarize some well known facts that will 
be frequently used in the subsequent sections. 

— Truncated differentials : Knudsen m proposed truncated differentials as a 
tool in block cipher cryptanalysis. In a standard differential attack (c/. ( 21 ), 
the full difference between two inputs/outputs is considered, whereas in the
case of truncated differentials, the difference is only partially determined,
i.e., for every byte, we only check whether there is a difference or not. A byte having
a non-zero difference is called active.


130 M. Lamberger et al. 


— Difference Propagation in MixRows: Since the MixRows operation is a linear 
transformation, standard differences propagate through MixRows in a deter- 
ministic way whereas truncated differences behave in a probabilistic way. 
The MDS property of the MixRows transformation ensures that the sum of 
the number of active input and output bytes is at least 9 (c/. Q). In general, 
the probability of any $x \to y$ transition with $1 \le x, y \le 8$ satisfying $x + y \ge 9$
is approximately $2^{(y-8) \cdot 8}$. For a detailed description of the propagation of
truncated differences in MixRows we refer to (Eli, see also |?Tj . 

— Differential Properties of SubBytes: Let $a, b \in \{0, 1\}^8$. For the Whirlpool
S-box, we are interested in the number of solutions to the equation

$S(x) \oplus S(x \oplus a) = b$. (4)

Exhaustively counting over all $2^{16}$ differentials shows that the number of
solutions to (4) can only be 0, 2, 4, 6, 8 and 256, which occur with frequency
39655, 20018, 5043, 740, 79 and 1, respectively. The task of returning all solutions
$x$ to (4) for a given differential $(a, b)$ is best solved by setting up a
precomputed table of size $256 \times 256$ which stores the solutions (if there are
any) for each $(a, b)$.

However, it is easy to see that for any permutation $S$ (to be more precise,
for any injective map) the expected number of solutions to (4) is always
1. We get that $2^{-16} \sum_{a,b} \#\{x \mid S(x \oplus a) \oplus S(x) = b\} = 2^{-16} \cdot 2^{16} = 1$,
because for a fixed $a$, every solution $x$ belongs to a unique $b$. Since the inputs
to all the S-boxes are independent, the same reasoning is valid for the full
SubBytes transformation.
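
The averaging argument holds for every permutation, which can be checked numerically. The Python sketch below builds the $256 \times 256$ table of solution counts for a randomly shuffled permutation. This toy S-box is an assumption for illustration and is not the Whirlpool S-box, so the frequencies quoted above are not reproduced; only the row sums and the average are.

```python
import random


def difference_distribution_table(sbox):
    """ddt[a][b] = number of x with sbox[x] ^ sbox[x ^ a] == b."""
    n = len(sbox)
    ddt = [[0] * n for _ in range(n)]
    for a in range(n):
        for x in range(n):
            ddt[a][sbox[x] ^ sbox[x ^ a]] += 1
    return ddt


def solutions(sbox, a, b):
    """All x solving sbox[x] ^ sbox[x ^ a] == b (what the lookup table stores)."""
    return [x for x in range(len(sbox)) if sbox[x] ^ sbox[x ^ a] == b]


rng = random.Random(2009)
toy_sbox = list(range(256))
rng.shuffle(toy_sbox)  # random permutation, NOT the Whirlpool S-box

ddt = difference_distribution_table(toy_sbox)
average = sum(map(sum, ddt)) / 256 ** 2
print(average)
```

Each row of the table sums to 256 because every $x$ contributes to exactly one output difference $b$, which is the observation the text uses.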

3.3 Application to Round-Reduced Whirlpool 

In this section, we will briefly describe the application of the rebound attack 
to the hash function Whirlpool. A detailed description of the attack is given 
in ra For a good understanding of our results, it is recommended to study 
these previous results on Whirlpool very carefully. 

The rebound attack on Whirlpool is a differential attack which uses a differ- 
ential trail with the minimum number of active S-boxes according to the wide 
trail design strategy. The core of the rebound attack on Whirlpool is a 4 round 
differential trail, where the fully active state is placed in the middle: 

$1 \to 8 \to 64 \to 8 \to 1$

In the rebound attack, one first splits the block cipher $W$ into three sub-ciphers
$W = W_{fw} \circ W_{in} \circ W_{bw}$, such that the most expensive part of the differential trail
is covered by the inbound phase $W_{in}$. In the inbound phase, the available degrees
of freedom (in terms of actual values of the state) are used to guarantee that
the differential trail in $W_{in}$ holds. The differential trail in the outbound phase
($W_{fw}$, $W_{bw}$) is supposed to have a relatively high probability. While standard
XOR differences are used in the inbound phase, truncated differentials are used
in the outbound phase of the attack.

Fig. 2. A schematic view of the rebound attack on 4 rounds of Whirlpool with round
key inputs. Black state bytes are active.

In the following, we briefly describe the inbound and outbound phase of the 
rebound attack on 4 rounds of Whirlpool. For a more detailed description, we 
refer to the original paper [HU ■ 

Inbound Phase. In the first step of the inbound phase, we choose a random
difference with 8 active bytes at the input of MixRows of round $r_2$ ($S_2^{SC}$). Note
that we need an active byte in each row of the state (see Fig. 2) to get a fully
active state after the MixRows transformation. Since AddRoundKey does not
change the difference, we get a fully active state at the input of SubBytes of
round $r_3$ ($S_2$). Then, we start with another difference in 8 active bytes at the
output of MixRows of round $r_3$ ($S_3^{MR}$) and propagate backwards. Again, since
we have an active byte in each row, we get a fully active state at the output of
SubBytes of round $r_3$.

In the second step of the inbound phase, the match-in-the-middle step, we
look for a matching input/output difference of the SubBytes layer of round $r_3$.
This is done as described in Sect. 3.2 with a precomputed $256 \times 256$ lookup
table. Note that we can repeat the inbound phase at most about $2^{128}$ times. As
indicated in Sect. 3.2, we expect one solution per trial; that is, we can produce
at most $2^{128}$ actual values that follow the differential trail in the inbound phase.

Outbound Phase. In contrast to the inbound phase, we use truncated differentials
in the outbound phase of the attack. By propagating the matching
differences and state values through the next SubBytes layer outwards, we get a
truncated differential in 8 active bytes in both the backward and forward directions.
These truncated differentials need to propagate from 8 to 1 active byte through
the MixRows transformation, both in the backward and forward direction (see
Fig. 2). The propagation of truncated differentials through the MixRows transformation
can be modelled in a probabilistic way (see Sect. 3.2). Since we need
to fulfill one $8 \to 1$ transition in each of the backward and forward directions, the
probability of the outbound phase is $2^{-2 \cdot 56} = 2^{-112}$. In other words, we have to
repeat the inbound phase about $2^{112}$ times to generate $2^{112}$ starting points for
the outbound phase of the attack.
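
The probability of the outbound phase follows directly from the approximation $2^{(y-8) \cdot 8}$ given in Sect. 3.2 for an $x \to y$ transition through MixRows. A small Python sketch of this calculation, with a function name of our own choosing:

```python
def mixrows_transition_log2(active_in: int, active_out: int) -> int:
    """Approximate log2 probability that a truncated difference with
    active_in active bytes in a row maps to active_out active bytes."""
    if not (1 <= active_in <= 8 and 1 <= active_out <= 8):
        raise ValueError("active byte counts must lie in 1..8")
    if active_in + active_out < 9:
        raise ValueError("transition is impossible by the MDS property")
    return (active_out - 8) * 8


# One 8 -> 1 transition backward and one forward.
outbound_log2 = 2 * mixrows_transition_log2(8, 1)

# Expected number of inbound-phase repetitions needed (log2).
inbound_repetitions_log2 = -outbound_log2
print(outbound_log2, inbound_repetitions_log2)
```

The required $2^{112}$ repetitions stay below the roughly $2^{128}$ inbound solutions that are available, so the 4-round attack has enough degrees of freedom.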

3.4 Previous Results on Round-Reduced Whirlpool 

Extending the 4-round trail in both the inbound and outbound phases leads
to attacks on round-reduced Whirlpool for up to 7.5 (out of 10) rounds (where
0.5 rounds consist only of SubBytes and ShiftColumns). To be more precise, by
extending the outbound phase of the attack by 0.5 and 2.5 rounds, one can construct
a collision and a near-collision for the Whirlpool hash function reduced to
4.5 and 6.5 rounds, respectively. The collision attack has a complexity of about
$2^{120}$ and the near-collision attack has a complexity of about $2^{128}$. Furthermore,
by additionally extending the inbound phase of the attack by 1 round, one can
find a collision and a near-collision for the compression function of Whirlpool
reduced to 5.5 and 7.5 rounds with a complexity of $2^{120}$ and $2^{128}$, respectively.
Note that adding this round in the inbound phase is possible, since in a compression
function attack, one can use the degrees of freedom of the key schedule
(chaining value) to guarantee that the trail in the inbound phase holds. All results
are summarized in Table 1; for more details on these results we refer
to fTT)| . 

4 Improved Rebound Attack on the Whirlpool 
Compression Function 

In this section, we improve the inbound phase of the original rebound attack on Whirlpool. By using a new differential trail and extensively using the available degrees of freedom of the key schedule, we can add 2 additional rounds to the inbound phase of the attacks. The basic idea is to have two inbound phases (match-in-the-middle steps) instead of one and to connect them using the available degrees of freedom from the key schedule. The outbound phase of the attacks is identical to that of the previous attacks on 5.5 and 7.5 rounds for the compression function of Whirlpool. As a result, we obtain a collision and a near-collision attack for the compression function of Whirlpool reduced to 7.5 and 9.5 rounds, respectively.

4.1 Inbound Phase 

In this section, we describe the improved inbound phase of the attack in detail. 
We use the following sequence of active bytes: 

$$8 \rightarrow 64 \rightarrow 8 \rightarrow 8 \rightarrow 64 \rightarrow 8$$

In order to find inputs following the differential of the inbound phase, we split it into two parts. In the first part, we apply the match-in-the-middle step with active bytes $8 \rightarrow 64 \rightarrow 8$ twice, in rounds 1-2 and 4-5. In the second part, we need to connect the resulting 8 active bytes and 64 (byte) values of the state between rounds 2 and 4 using the degrees of freedom we have in the choice of the round key values (see Fig. 3).

Inbound Part 1. In this part of the inbound phase, we apply the match-in-the-middle step twice for rounds 1-2 and 4-5 (see Fig. 3), which can be summarized as follows:

1. Precomputation: For the S-box, compute a $256 \times 256$ lookup table as described in Sect. 3.2.

Fig. 3. The inbound phase of the attack

2. Match-in-the-middle (rounds 1-2):

(a) Start with 8 active bytes at the output of AddRoundKey in round $r_2$ ($S_2$) and propagate backward to the output of SubBytes in round $r_2$ ($S_2^{SB}$).

(b) Start with 8 active bytes at the input of MixRows in round $r_1$ ($S_1^{SC}$) and propagate forward to the input of SubBytes in round $r_2$ ($S_1$). Note that we can compute forward and solve the following step for each row independently.

(c) Connect the input and output of the S-boxes of round $r_2$ by choosing the actual values of the state $S_1$, respectively $S_2^{SB}$, using the lookup table generated in the precomputation step. After repeating step (b) for each row about $2^8$ times, we expect to find a match for the 8 S-boxes and thus $2^8$ actual values (see Sect. 3.2). Since we do this for all rows independently, we get about $2^{64}$ actual values for the full state $S_1$, respectively $S_2^{SB}$, such that the trail holds.

3. Match-in-the-middle (rounds 4-5): Do the same as in Step 2.

Hence, we get $2^{64}$ candidates for $S_2^{SB}$ and $2^{64}$ candidates for $S_4$ after the first part of the inbound phase of the attack with a complexity of about $2^9$ round transformations.
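The precomputed $256 \times 256$ table used in the match-in-the-middle step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the S-box below is a hypothetical seeded random permutation standing in for the Whirlpool S-box, and all function names are assumptions.

```python
import random

def toy_sbox(seed=0):
    """Hypothetical 8-bit S-box: a seeded random permutation, NOT the Whirlpool S-box."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table

def difference_table(sbox):
    """ddt[a][b] = number of inputs x with S(x) ^ S(x ^ a) == b."""
    ddt = [[0] * 256 for _ in range(256)]
    for a in range(256):
        row = ddt[a]
        for x in range(256):
            row[sbox[x] ^ sbox[x ^ a]] += 1
    return ddt

def matching_values(sbox, a, b):
    """All inputs x that follow the S-box differential a -> b."""
    return [x for x in range(256) if sbox[x] ^ sbox[x ^ a] == b]

sbox = toy_sbox()
ddt = difference_table(sbox)
possible = sum(1 for a in range(1, 256) for b in range(1, 256) if ddt[a][b] > 0)
print("fraction of possible nonzero differentials:", possible / 255**2)
```

A lookup of `ddt[a][b]` tells whether an input difference `a` and an output difference `b` can be connected at one S-box, and `matching_values` returns the actual byte values that do so; solutions always come in pairs `x`, `x ^ a`.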

Inbound Part 2. In the second part of the inbound phase, we have to connect the 8 active bytes (64 bit conditions) as well as the actual values (512 bit conditions) of $S_2^{SB}$ and $S_4$ by choosing the subkeys $K_2$, $K_3$ and $K_4$ accordingly. Therefore, we have to solve the following equation:

$$MR(SC(SB(MR(SC(SB(MR(SC(S_2^{SB})) \oplus K_2))) \oplus K_3))) \oplus K_4 = S_4 \qquad (5)$$

with

$$K_3 = MR(SC(SB(K_2))) \oplus C_3, \qquad K_4 = MR(SC(SB(K_3))) \oplus C_4. \qquad (6)$$

Since we have $2^{64}$ candidates for $S_2^{SB}$, $2^{64}$ candidates for $S_4$ and $2^{512}$ candidates for the 3 subkeys $K_2$, $K_3$, $K_4$ (because of (6)), we expect to find $2^{64}$ solutions.

Since $S_2^{MR} = MR(SC(S_2^{SB}))$, we can rewrite the above equation as follows:

$$MR(SC(SB(MR(SC(SB(S_2^{MR} \oplus K_2))) \oplus K_3))) \oplus K_4 = S_4 \qquad (7)$$

Note that one can always change the order of SC and SB in the Whirlpool block cipher without affecting the output of one round. In order to make the subsequent description of the attack easier, we do this here and get the following equation:

$$MR(SC(SB(MR(SB(SC(S_2^{MR} \oplus K_2))) \oplus K_3))) \oplus K_4 = S_4 \qquad (8)$$
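That SC and SB can be swapped follows from SB acting on each byte separately while SC only moves bytes around. The sketch below checks this on an 8x8 byte state. It is an illustration only: the S-box is a hypothetical random permutation, and the ShiftColumns convention (column $j$ rotated downwards by $j$ positions) is stated as an assumption.

```python
import random

ROWS = COLS = 8

def toy_sbox(seed=1):
    """Hypothetical byte permutation standing in for the Whirlpool S-box."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table

def sub_bytes(state, sbox):
    """SB: apply the S-box to every byte of the 8x8 state."""
    return [[sbox[b] for b in row] for row in state]

def shift_columns(state):
    """SC: rotate column j downwards by j positions (assumed convention)."""
    out = [[0] * COLS for _ in range(ROWS)]
    for i in range(ROWS):
        for j in range(COLS):
            out[(i + j) % ROWS][j] = state[i][j]
    return out

rng = random.Random(2)
state = [[rng.randrange(256) for _ in range(COLS)] for _ in range(ROWS)]
sbox = toy_sbox()

# A bytewise substitution commutes with any permutation of byte positions.
assert sub_bytes(shift_columns(state), sbox) == shift_columns(sub_bytes(state, sbox))
```

The same argument holds for any byte permutation in place of `shift_columns`, so the check does not depend on the exact shift convention chosen here.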

Furthermore, MR and SC are linear transformations and hence we can rewrite the above equation as follows:

$$SB(MR(SB(S_2^* \oplus K_2^*)) \oplus K_3) \oplus K_3^{SB} = X \qquad (9)$$

with $S_2^* = SC(S_2^{MR})$, $K_2^* = SC(K_2)$, $K_3^{SB} = SB(K_3)$, $X = SC^{-1}(MR^{-1}(S_4 \oplus C_4))$.

In the following, this equivalent description is used to connect the values and differences of the two states $S_2^{MR}$ and $S_4$.



Fig. 4. The second part of the inbound phase. Black state bytes are active. 

Remember that the two 8-byte differences of $S_2^*$ and $X$ have already been fixed due to the previous steps. Furthermore, we can choose from $2^{64}$ values for each of the states $S_2^*$ and $X$. Now, we use equation (9) to determine the subkey $K_2$ such that we get a solution for the inbound phase of the attack. Note that we can solve (9) for each row of the equation independently (see Fig. 4). It can be summarized as follows.

1. Compute the 8-byte difference and the $2^{64}$ values of the state $S_2^*$ from $S_2^{SB}$, and compute the 8-byte difference and the $2^{64}$ values of the state $X$ from $S_4$. Note that we can compute and store the values of $S_2^*$ and $X$ row-by-row and independently. Hence, both the complexity and memory requirements for this step are $2^8$ instead of $2^{64}$.

2. Repeat the following steps for all $2^{64}$ values of the first row of $S_2^*$ to get $2^{64}$ matches for $S_2^*$ to $S_4$:

(a) For the chosen value of the first row of $S_2^*$, forward compute the differences and values to the first row of $S_3^{MR}$.

(b) Choose the first row of the key $K_3$ such that the differential of the S-box between $S_3$ and $S_3^{SB}$ holds.

(c) Compute the first row of $K_2$, $S_2^*$, $K_3^{SB}$ and $X$. Since we have $2^{64}$ values for the first row of $S_2^*$ and $2^{64}$ values for the first row of $X$, we expect to find a match on both sides. In other words, we have now connected the values and differences of the first row.

(d) Next, we connect the values of rows 2-8 independently by a simple brute-force search over all $2^{64}$ corresponding key values of $K_2$. Since we have to connect 64 bit values and we test $2^{64}$ key values, we expect to always find a solution.

In total, we get $2^{64}$ matches connecting state $S_2^*$ to state $X$ with a complexity of $2^{128}$ and memory requirements of $2^8$. In other words, with the values of $S_2^*$, $X$ and the corresponding key $K_2^*$, we get $2^{64}$ starting points for the outbound phase of the attack. Hence, the average complexity to find one starting point for the outbound phase is $2^{64}$. It is important to note that one can construct a total of $2^{192}$ starting points in the inbound phase to be used in the outbound phase of the attack.

Note that step 2 (d) can be implemented using a precomputed lookup table of size $2^{128}$. In this lookup table each row of the key $K_2$ (64 bits) is saved for the corresponding two rows of $S_2^*$ and $X$ (64 bits each). Using this lookup table, we can find one starting point for the outbound phase with an average complexity of 1. However, the complexity to generate this lookup table is $2^{128}$.

4.2 Outbound Phase 

In the outbound phase of the attack, we further extend the differential path 
backward and forward. By propagating the matching differences and state values 
through the next SubBytes layer, we get a truncated differential in 8 active bytes 
for each direction. These truncated differentials need to follow a specific active 
byte pattern to result in a collision on 7.5 rounds and a near-collision on 9.5 
rounds, respectively. In the following, we will describe the outbound phase for 
the collision and near-collision attack in detail. 

Collision for 7.5 Rounds. By adding 1 round in the beginning and 1.5 rounds at the end of the trail, we get a collision for 7.5 rounds for the compression function of Whirlpool. In the attack, we use the following sequence of active bytes:

$$1 \rightarrow 8 \rightarrow 64 \rightarrow 8 \rightarrow 8 \rightarrow 64 \rightarrow 8 \rightarrow 1 \rightarrow 1$$

As described in Sect. 3.2, the propagation of truncated differentials through the MixRows transformation is modelled in a probabilistic way. For the differential trail to hold, we need the truncated differentials in the outbound phase to propagate from 8 to 1 active byte through the MixRows transformation, both in the backward and forward direction (see Fig. 5). Since the transition from 8 active bytes to 1 active byte through the MixRows transformation has a probability of about $2^{-56}$, the probability of this part of the outbound phase is $2^{-2 \cdot 56} = 2^{-112}$. Furthermore, to construct a collision at the output (after the feed-forward), the exact values of the input and output difference have to match. Since only one byte is active (see Fig. 5), this can be fulfilled with a probability of $2^{-8}$. Hence, the probability of the outbound phase is $2^{-112} \cdot 2^{-8} = 2^{-120}$. In other words, we have to generate $2^{120}$ starting points (for the outbound phase) in the inbound phase of the attack to find a collision for the compression function of Whirlpool reduced to 7.5 rounds.

Fig. 5. Differential trail for collision attack on 7.5 rounds

Since we can find one starting point with an average complexity of about $2^{64}$ and memory requirements of $2^8$, we can find a collision with a complexity of about $2^{120+64} = 2^{184}$. The complexity of the attack can be further improved at the cost of higher memory requirements. By using a lookup table with $2^{128}$ entries (generated in a precomputation step), we can find one starting point for the outbound phase with an average complexity of 1. In other words, we can find a collision for the compression function reduced to 7.5 rounds with a complexity of about $2^{120}$. However, the precomputation step (constructing the lookup table) has a complexity of about $2^{128}$.

Near-Collision for 9.5 Rounds. The collision attack on 7.5 rounds for the compression function can be further extended by adding one round at the beginning and one round at the end of the trail in the outbound phase. The result is a near-collision attack on 9.5 rounds for the compression function of Whirlpool with the following sequence of active bytes:

$$8 \rightarrow 1 \rightarrow 8 \rightarrow 64 \rightarrow 8 \rightarrow 8 \rightarrow 64 \rightarrow 8 \rightarrow 1 \rightarrow 8 \rightarrow 8$$

Since the 1-byte difference at the beginning and end of the 7.5 round trail will always result in 8 active bytes after one MixRows transformation (due to the properties of MixRows described earlier), we can go backward 1 round and forward 1 round at no additional cost. Using the feed-forward, the positions of two active S-boxes match and cancel each other with a probability of $2^{-16}$. Hence, we get a collision in 50 and 52 bytes for the compression function of Whirlpool with a complexity of about $2^{176}$ and $2^{176+16} = 2^{192}$, respectively. With a precomputation step with complexity of $2^{128}$ and similar memory requirement, one can find a near-collision for the compression function of Whirlpool with a complexity of about $2^{112}$ (collision in 50 bytes) and $2^{128}$ (collision in 52 bytes), respectively.
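The complexity figures of Sect. 4.2 all follow from the same bookkeeping: the number of starting points needed is the inverse of the outbound probability, and each starting point costs about $2^{64}$ (or about 1 with the $2^{128}$ lookup table). A small sketch of this arithmetic, with hypothetical names and all quantities given as base-2 logarithms taken from the text:

```python
LOG2_MIXROWS_8_TO_1 = -56   # one 8 -> 1 transition through MixRows
LOG2_STARTING_POINT = 64    # average cost of one starting point (memory 2^8)

def attack_costs(extra_conditions_log2):
    """Return (log2 cost without table, log2 cost with the 2^128 table) for an
    outbound phase with two 8 -> 1 transitions plus further conditions."""
    log2_prob = 2 * LOG2_MIXROWS_8_TO_1 + extra_conditions_log2
    starting_points = -log2_prob
    return starting_points + LOG2_STARTING_POINT, starting_points

collision_7_5 = attack_costs(-8)    # 1-byte difference must cancel in the feed-forward
near_50_bytes = attack_costs(0)     # near-collision in 50 bytes
near_52_bytes = attack_costs(-16)   # two more bytes cancel
print(collision_7_5, near_50_bytes, near_52_bytes)
```

The second entry of each pair does not include the one-time precomputation of about $2^{128}$ needed to build the table.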



Fig. 6. In the attack on 9.5 rounds we extend the trail one more round at the beginning 
and at the end of the outbound phase to get a near-collision of Whirlpool 


5 A Subspace Distinguisher for 10 Rounds 

In this section, we present the first cryptanalytic result on the full Whirlpool compression function. The previous result on 9.5 rounds is extended to the full 10 rounds of the compression function by defining a different attack scenario. Instead of aiming for a near-collision, we are interested in distinguishing the Whirlpool compression function from a random function. For this, we will introduce a new kind of distinguishing attack, a so-called subspace distinguisher. In the following, $\mathbb{F}_2 = GF(2)$ always denotes the finite field of order 2.

For the subspace distinguishing attack, we consider the following problem:

Problem 1. Given a function $f$ mapping to $\mathbb{F}_2^N$, try to find $t$ input pairs such that the corresponding output differences belong to a vector space of dimension at most $n$ for some $n \le N$.

Remark. We define Problem 1 in this generic way in order to make it more generally applicable. This will be shown in the extended version of this paper.

5.1 Solving Problem 1 for the Whirlpool Compression Function

In this section, we show how the compression function attack described in Sect. 4 can be used to distinguish the full Whirlpool compression function from a random function.

Obviously, the difference between two Whirlpool states can be seen as a vector in the vector space of dimension $N = 512$ over $\mathbb{F}_2$. The crucial observation is that the attack of Sect. 4 can be interpreted as an algorithm that can find $t$ difference vectors in $\mathbb{F}_2^{512}$ (output differences of the compression function) that form a vector space of dimension $n \le 128$.

To see this, observe that by extending the differential trail from 9.5 to 10 rounds, the 8 active bytes in $S_{10}^{SC}$ will always result in a fully active state $S_{10}$ due to the properties of the MixRows transformation. Thus the near-collision is destroyed. However, if we look again at Fig. 6, the differences in $M_t$ and the differences in $S_{10}^{SC}$ can be seen as (difference) vectors belonging to subspaces of $\mathbb{F}_2^{512}$ of dimension at most 64.

Even though after the application of MixRows and AddRoundKey the state $S_{10}$ is fully active in terms of truncated differentials, the differences in $S_{10}$ still belong to a subspace of $\mathbb{F}_2^{512}$ of dimension at most 64 due to the properties of MixRows. Therefore, after the feed-forward, we can conclude that the differences at the output of the compression function form a subspace of $\mathbb{F}_2^{512}$ of dimension $n \le 128$.

Hence, we can use the attack of Sect. 4 to find $t$ difference vectors forming a vector space of dimension $n \le 128$ with a complexity of $t \cdot 2^{176}$, or $t \cdot 2^{112}$ using a precomputation step with complexity $2^{128}$. Note that $t \le 2^{192-112} = 2^{80}$ due to the remaining degrees of freedom in the inbound phase of the attack.
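The dimension argument can be illustrated with a rank computation over GF(2). The sketch below is a toy model under stated assumptions: differences are 512-bit integers, each of the two subspaces is modelled as "nonzero only in 8 fixed byte positions" (the positions are hypothetical), and the linear MixRows layer is left out because an invertible linear map does not change the dimension of a subspace.

```python
import random

def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as Python integers (bit i = coordinate i)."""
    rows = list(vectors)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot   # lowest set bit of the pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def random_difference(rng, active_bytes):
    """Random 512-bit difference that is nonzero only in the given byte positions."""
    value = 0
    for pos in active_bytes:
        value |= rng.randrange(256) << (8 * pos)
    return value

rng = random.Random(0)
message_side = range(0, 8)    # hypothetical positions of 8 active bytes
state_side = range(8, 16)     # hypothetical positions of 8 other active bytes

# Feed-forward: each output difference is the XOR of one vector from each subspace.
outputs = [random_difference(rng, message_side) ^ random_difference(rng, state_side)
           for _ in range(400)]
print(gf2_rank(outputs))      # never exceeds 64 + 64 = 128
```

However many output differences are collected, their span stays within a space of dimension at most 128, whereas differences of a random function with 512-bit output would quickly span a much larger space.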

Now the main question is for which values of $t$ our attack is more efficient than a generic attack. In other words, how do we have to choose $t$ such that we can distinguish the compression function of Whirlpool from a random function? To answer this, we first have to bound the complexity of the generic attack. This is described in the next section.

5.2 Solving Problem 1 for a Random Function

Remarks on the Security Model. In order to discuss generic attack scenarios, we will have to choose a security model. We will adopt the black box model introduced by Shannon m. In this model, a block cipher can be seen as a family of functions parameterized by the secret key $k \in \mathcal{K}$, that is, $E : \{0,1\}^{|k|} \times \{0,1\}^N \rightarrow \{0,1\}^N$, where for each $k \in \mathcal{K}$, $E_k$ is seen as a uniformly chosen random permutation on $\{0,1\}^N$.

In P] it was shown that an ideal block cipher based hash function in the Miyaguchi-Preneel mode is collision resistant and non-invertible. Based on this, we model our compression function $f$ as a black box oracle to which only forward queries are admissible. We also want to note that in all of the following, when we are talking about complexity, we are talking about query complexity. Note that the practical complexity is always greater than or equal to the query complexity.

The Generic Approach. In this generic approach, the only property used about $f$ is the fact that the outputs of $f$ are contained in the vector space $\mathbb{F}_2^N$.

Let us now assume that an adversary is making $Q$ queries to the function $f$. Assuming that $Q \ll 2^{N/2}$, we thus get $K = \binom{Q}{2}$ differences ($\in \mathbb{F}_2^N$) coming from these $Q$ queries. For given $n$ and $t \gg n$, we now want to calculate the probability that among these $K$ difference vectors, we have $t$ vectors that span a space of dimension less than or equal to $n$.

We will need the following fact about matrices over finite fields. Let $E(t, N, d)$ denote the number of $t \times N$ matrices over $\mathbb{F}_2$ that have rank equal to $d$. Then, it is well known (cf. j0| or [El) that

$$E(t, N, d) = \prod_{i=0}^{d-1} \left(2^N - 2^i\right) \cdot \binom{t}{d}_2 = \prod_{i=0}^{d-1} \frac{(2^N - 2^i)\,(2^t - 2^i)}{2^d - 2^i} \qquad (10)$$

where $\binom{t}{d}_2$ denotes the $q$-binomial coefficient with $q = 2$.
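The counting formula can be sanity-checked by brute force for tiny parameters. The sketch below assumes that (10) reads $E(t,N,d) = \prod_{i=0}^{d-1} (2^N - 2^i)(2^t - 2^i)/(2^d - 2^i)$, the standard count of rank-$d$ matrices over $\mathbb{F}_2$; the function names are hypothetical.

```python
from itertools import product

def rank_count(t, N, d):
    """E(t, N, d): number of t x N matrices over GF(2) of rank d, following (10)."""
    numerator, denominator = 1, 1
    for i in range(d):
        numerator *= (2**N - 2**i) * (2**t - 2**i)
        denominator *= 2**d - 2**i
    return numerator // denominator

def gf2_rank(rows):
    """Rank over GF(2) of rows given as integers (one bit per column)."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def brute_force_count(t, N, d):
    """Count rank-d matrices by enumerating all 2^(t*N) of them (tiny sizes only)."""
    return sum(1 for rows in product(range(2**N), repeat=t) if gf2_rank(rows) == d)

t, N = 3, 3
for d in range(min(t, N) + 1):
    assert rank_count(t, N, d) == brute_force_count(t, N, d)
assert sum(rank_count(t, N, d) for d in range(4)) == 2**(t * N)
```

For $t = N = 3$ the counts for ranks 0 to 3 are 1, 49, 294 and 168, which sum to $2^9 = 512$; the last value is the order of $GL(3, 2)$.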

Proposition 1. Let $n, t, N \in \mathbb{N}$ be given such that $t, N > n$. We assume a set of $K$ vectors chosen uniformly at random from $\mathbb{F}_2^N$. Let $\Pr(K, t, N, n)$ denote the probability that $t$ of these $K$ vectors span a space of dimension not larger than $n$. Then, we have

$$\Pr(K, t, N, n) \le \binom{K}{t}\, 2^{-tN} \sum_{d=0}^{n} E(t, N, d) \qquad (11)$$

$$\le \frac{1}{\sqrt{2\pi t}} \left(\frac{K e}{t}\right)^{t} 2^{-(N-n)(t-n)-(n-1)} \qquad (12)$$

Proof. Based on the definition of $E(t, N, d)$, it is easy to see that (11) is an upper bound for $\Pr(K, t, N, n)$.

Computing the second bound consists of two steps: bounding the binomial coefficient and bounding the rest. We get

$$2^{-tN} \sum_{d=0}^{n} E(t, N, d) \le 2^{-tN} \cdot 2 \cdot E(t, N, n) \qquad (13)$$

$$\le 2^{-tN+1} \left( \frac{(2^N - 2^{n-1}) \cdot (2^t - 2^{n-1})}{2^n - 2^{n-1}} \right)^{n} \qquad (14)$$

$$< 2^{-tN+1} \left( 2^{N-1} \cdot 2^{t-(n-1)} \right)^{n} \qquad (15)$$

$$= 2^{-(t-n)(N-n)-(n-1)}. \qquad (16)$$

These inequalities are based on two facts. First, it is easy to show that for $t, N > n$, we have $E(t, N, n) \le \sum_{d=0}^{n} E(t, N, d) \le 2 \cdot E(t, N, n)$. This can be proven by using induction over $n$ and elementary properties of the $q$-binomial coefficient. Second, (14) follows from the fact that the function defined by $f(x) = (2^N - x)(2^t - x) / (2^n - x)$ is strictly increasing on the interval $x \in [0, 2^{n-1}]$.

For the binomial coefficient $\binom{K}{t}$ we combine the simple estimate $\binom{K}{t} \le K^t / t!$ with the following inequality based on Stirling’s formula m :

$$\sqrt{2\pi}\, t^{t+\frac{1}{2}} e^{-t+\frac{1}{12t+1}} < t! < \sqrt{2\pi}\, t^{t+\frac{1}{2}} e^{-t+\frac{1}{12t}} \qquad (17)$$

From this we get $\binom{K}{t} < \frac{1}{\sqrt{2\pi t}} \left(\frac{K e}{t}\right)^{t}$, and with (16) this proves the proposition. ■

As a corollary, we can give a lower bound for the number of random vectors needed to fulfill the conditions of the proposition with a certain probability.

Corollary 1. For a given probability $p$ and given $N, n, t$ as in Proposition 1, the number $K$ of random vectors needed to contain $t$ vectors spanning a space of dimension not larger than $n$ with a probability $p$ is lower bounded by

$$K \ge \frac{t}{e} \left(p \sqrt{2\pi t}\right)^{\frac{1}{t}} \cdot 2^{\frac{(N-n)(t-n)+(n-1)}{t}} \qquad (18)$$

and the number of queries $Q$ to $f$ needed to produce $t$ vectors spanning a space of dimension not larger than $n$ with a probability $p$ is lower bounded by

$$Q \ge \sqrt{\frac{2t}{e} \left(p \sqrt{2\pi t}\right)^{\frac{1}{t}}} \cdot 2^{\frac{(N-n)(t-n)+(n-1)}{2t}} \qquad (19)$$

Proof. Equation (18) follows immediately from (12), and (19) follows from setting $K = \binom{Q}{2} = Q(Q-1)/2$ in (18). ■


5.3 Complexity of the Distinguishing Attack 

Table 2 shows the complexities of the generic approach and our dedicated approach for several values of $t$. As can be seen in the table, one can distinguish the full Whirlpool compression function from random with a complexity of about $2^{188}$ with $t = 2^{12}$ (or $2^{121}$ with $t = 2^{9}$ using a precomputation table). In other words, when performing $2^{188}$ queries to a random function, (12) shows that the probability of solving Problem 1 for $t = 2^{12}$ is $\ll 1$. To the best of our knowledge, this is the first result on the full Whirlpool compression function.

Table 2. Values for $t$, $Q$ (query complexity), $C$ (complexity of our attack), and $C_p$ (complexity of our attack with precomputation) for $p = 1$, $N = 512$, $n = 128$

log2(t)   log2(Q)   log2(C)   log2(C_p)
   9      148.41      185       121
  10      172.84      186       122
  11      185.31      187       123
  12      191.80      188       124
  13      195.29      189       125
  14      197.28      190       126
  15      198.53      191       127
  16      199.40      192       128

6 Conclusion 

In this paper, we have proposed a new kind of distinguishing attack for cryptanal- 
ysis of hash functions. We have successfully attacked the Whirlpool compression 
function. To the best of our knowledge this is the first attack on full Whirlpool. 

We have obtained this result by improving the rebound attack on reduced 
Whirlpool. First, the inbound phase of the rebound attack was extended by up 
to two rounds using the available degrees of freedom from the key schedule. This 
resulted in a near-collision attack on 9.5 rounds of the compression function 
of Whirlpool. Second, we have shown how to turn this rebound near-collision 
attack into a distinguishing attack for the full 10 round compression function of 
Whirlpool. 

The idea seems applicable to a wider range of hash function constructions. In particular, the attacks described in this paper can be applied to the hash function Maelstrom jH] in a straightforward manner because of the similarity to Whirlpool (see also f I fij i ). Several SHA-3 candidates are a natural target for this new kind of attack, see for instance f 1 411 5j . Furthermore, subspace distinguishers can be applied to block ciphers as well. This will be discussed in an extended version of this paper.

A Attacks on the Hash Function

In this section, we present a collision and near-collision for the Whirlpool hash function. The attacks are a straightforward extension of the collision and near-collision attack on 4.5 and 6.5 rounds of Whirlpool presented in eg. By adding one round in the inbound phase we can find a collision and a near-collision for Whirlpool reduced to 5.5 and 7.5 rounds, respectively. The core of the attack is a 5 round differential trail, where two fully active states are placed in the middle:

$$1 \rightarrow 8 \rightarrow 64 \rightarrow 64 \rightarrow 8 \rightarrow 1$$

Since the outbound phase of the attacks is identical to the previous attacks (see Sect. EJ), we only discuss the inbound phase of the attack here (see Fig. 7).



Fig. 7. The inbound phase of the collision attack and near-collision attack on the hash 
function 

It can be summarized as follows. 

1. Precomputation: For the S-box, compute a $256 \times 256$ lookup table as described in Sect. 3.2.

2. Start with 8 active bytes (differences) at the input of MixRows in round $r_2$ ($S_2^{SC}$) and propagate forward to the input of SubBytes in round $r_3$ ($S_2$).

3. Start with 8 active bytes at the output of MixRows in round $r_4$ ($S_4^{MR}$) and propagate backward to the output of SubBytes in round $r_4$ ($S_4^{SB}$).

4. Next we have to connect the states $S_2$ and $S_4^{SB}$ such that the differential trail holds. In other words, we have to find the actual values for $S_2$ such that:

$$SB(MR(SC(SB(S_2))) \oplus K_3) \oplus SB(MR(SC(SB(S_2 \oplus \Delta_1))) \oplus K_3) = \Delta_2$$

where $\Delta_1$ denotes the active bytes (differences) in $S_2$ and $\Delta_2$ denotes the active bytes (differences) in $S_4^{SB}$. In the following, we will show how this equation can be solved with a complexity of about $2^{64}$ by solving the equation for sets of 8 bytes independently. It can be summarized as follows.

(a) For all $2^{64}$ values of $S_2[0,0], S_2[1,7], \ldots, S_2[7,1]$ compute the first row of $S_4^{SB}$ and check if the above equation holds. Note that due to ShiftColumns, these bytes are shifted to the first row of $S_3^{SC}$ and MixRows works on each row independently. In other words, we get $2^{64}$ candidates for each row of $S_4^{SB}$. Hence, after testing all $2^{64}$ candidates for the first row of $S_4^{SB}$ we expect to find a match for the first row of $\Delta_2$.

(b) Do the same for the corresponding 8 bytes for rows 2-8 of $S_4^{SB}$.

After testing each set of 8 bytes independently, we will find a state $S_2$ such that the differential trail is connected. Finishing this step of the attack has a complexity of about $8 \cdot 2^{64}$ MixRows ($\approx 2^{64}$ round computations).

Hence, we can compute one starting point for the outbound phase with a complexity of about $2^{64}$. Note that the complexity of the inbound phase can be significantly reduced at the cost of higher memory requirements. By saving $2^{64-s}$ candidates for $S_4^{SB}$ in a list, we can do a standard time/memory tradeoff with a complexity of about $2^{120+s}$ and memory requirements of $2^{64-s}$. By setting $s = 0$ we can find $2^{64}$ starting points with a complexity of $2^{64}$ and similar memory requirements of $2^{64}$.

Hence, we can find a collision for Whirlpool reduced to 5.5 rounds with a complexity of about $2^{120}$ and a near-collision for 7.5 rounds in 50 (respectively 52) bytes with a complexity of about $2^{112}$ (respectively $2^{128}$). All attacks have memory requirements of $2^{64}$.

Abstract. We consider a long-standing problem in cryptanalysis: attacks on hash function combiners. In this paper, we propose the first
attack that allows collision attacks on combiners with a runtime below 
the birthday-bound of the smaller compression function. This answers 
an open question by Joux posed in 2004. 

As a concrete example we give such an attack on combiners with the 
widely used hash function MD5. The cryptanalytic technique we use 
combines a partial birthday phase with a differential inside-out tech- 
nique, and may be of independent interest. This potentially reduces the 
effort for a collision attack on a combiner like MD5||SHA-1 for the first 

Keywords: hash functions, cryptanalysis, MD5, combiner, differential. 

1 Introduction 

The recent spur of cryptanalytic results on popular hash functions like MD5 and SHA-1 J28i:il)i;f l| suggests that they are (much) weaker than originally anticipated, especially with respect to collision resistance. It seems non-trivial to propose a concrete hash function which inspires long term confidence. Even more so as we seem unable to construct collision resistant primitives from potentially simpler primitives m. Hence constructions that allow hedging bets, like concatenated combiners, are of great interest. Before we give a preview of our results, we will first review work on combiners.

Review of work on combiners. The goal of combiners is to have at least some 
bound on the expected security even if (some of the) hash functions get broken, 
for various definitions of “security” and “broken”. Joux m showed (by using 
multi-collisions) that the collision resistance of a combiner can not be expected 
to be much higher than the birthday bound of the component (=hash function) 
with the largest output size. 

On the other hand, combiners seem to be very robust when it comes to collision security up to the birthday bound (of the component with the smallest output size): By using techniques similar to Coron et al. 0, Hoch and Shamir m showed that only very mild assumptions on a compression function are needed to achieve a collision resistance of at least $O(2^{n/2})$. In fact, using a model proposed by Liskov B5I. they show that none of the compression functions need to be collision or preimage resistant in the usual sense.

Motivation: cryptanalysis of combiners. Concatenating the outputs of hash functions is often used by implementors to “hedge bets” on hash functions. A combiner of the form MD5||SHA-1 as used in SSL 3.0/TLS 1.0 and TLS 1.1 j7IHj is an example of such a strategy. Let’s assume we are given a combiner of the form MD5||SHA-1. Let’s further assume that a breakthrough in cryptanalysis of SHA-1 brings down the complexity of a collision search attack to $2^{52}$. We know that the best collision search attacks on MD5 are as fast as $2^{15}$ m. So what is the best collision attack on the combiner? The best known method due to Joux is only as good as a birthday attack on the smaller of the two hash functions in the combiner. There is no known method which would allow reducing the total effort below this bound, i.e. $2^{64}$.

Currently, the best solution at our disposal is to combine the (hypothetical) SHA-1 attack with Joux’s multicollision approach: find a $2^{64}$-multicollision for SHA-1 with effort $2^{52} \cdot 64 = 2^{58}$, and then perform a birthday-type search in this $2^{64}$-multicollision to single out a collision which also collides for MD5. The total effort will be $2^{64}$. In fact, reductions of the effort for SHA-1 collision search will only marginally improve the attack on the combiner. How to improve upon this? Analyzing the combiner as a whole may be prohibitively complicated. The resistance of two-pipe designs with sufficiently different pipes like RIPEMD-160 f 1 01 against recent collision search attacks also gives hints in this direction.
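The effort estimate for the multicollision-based attack can be written out as a short calculation. The sketch below only restates the arithmetic of the paragraph above under its own hypothetical assumption of a $2^{52}$ SHA-1 collision attack; the function name is an assumption.

```python
from math import log2

def multicollision_cost_log2(k, single_collision_log2):
    """log2 cost of a 2^k-multicollision built from k successive collisions (Joux)."""
    return single_collision_log2 + log2(k)

sha1_collision_log2 = 52   # hypothetical SHA-1 collision cost assumed in the text
md5_output_bits = 128

k = md5_output_bits // 2   # 2^64 colliding messages suffice for an MD5 birthday match
multicollision = multicollision_cost_log2(k, sha1_collision_log2)   # 52 + 6 = 58
birthday = md5_output_bits / 2                                      # 64
total = log2(2 ** multicollision + 2 ** birthday)
print(multicollision, birthday, round(total, 3))
```

The birthday phase dominates: lowering the SHA-1 collision cost further changes the total only in the fractional part of the exponent, which is the point made in the text.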

Preview of our results: We propose a new method that allows a cryptanalyst to focus on the hash functions individually while still potentially allowing attacks on combiners with a runtime below the birthday-bound of the smaller compression function. This also answers an open question by Joux posed in 2004 m. For this, we start with definitions in Section 2. In Section 3 we give a high-level description of our attack strategy on a concatenation combiner without going into the details of a particular compression function. Next, we consider as a concrete cryptanalytic example combiners that use MD5. We first give an alternative description of MD5 in Section 4, which will turn out to be beneficial (and in fact, as our experiments suggest, necessary) in Section 5, where we describe the cryptanalytic techniques we need to be able to use the high-level attack description.

For the cryptanalysis, we employ a combination of a birthday-style attack and a differential inside-out technique that uses different parts of a collision characteristic at different stages of an attack, both before and after a birthday phase. The differential technique may be of independent interest, also for improving known types of collision attacks on MD5, or for finding one-block collisions. In Section 6 we give practical results which allow us to estimate the actual security MD5 is able to give in a combiner. Finally, we conclude and discuss open problems in Section 7.

2 Definitions

In the remainder of the paper we give a few definitions. We give a classification of collision attacks on compression functions and hash functions. Let an iterated hash function $F$ be built by iterating a compression function $f : \{0,1\}^m \times \{0,1\}^n \rightarrow \{0,1\}^n$ as follows:

- Split the message $M$ of arbitrary length into $k$ blocks $x_i$ of size $m$.

- Set $h_0$ to a pre-specified IV.

- Compute $\forall x_i : h_i = f(h_{i-1}, x_i)$.

- Output $F(M) = h_k$.
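The iteration above can be sketched in a few lines. This is a toy model only: the compression function is a hypothetical stand-in built from truncated SHA-256, the block and chaining sizes are chosen for illustration, and padding (which the definition above does not specify) is left out by requiring the message length to be a multiple of the block size.

```python
import hashlib

BLOCK_BYTES = 8    # toy block size m (in bytes), chosen only for illustration
CHAIN_BYTES = 4    # toy chaining size n (in bytes)
IV = bytes(CHAIN_BYTES)

def compress(h, x):
    """Hypothetical compression function f(h, x): truncated SHA-256 of h || x."""
    return hashlib.sha256(h + x).digest()[:CHAIN_BYTES]

def iterated_hash(message):
    """F(M): split M into blocks, chain them through f, output the last h."""
    if len(message) % BLOCK_BYTES:
        raise ValueError("sketch assumes the length is a multiple of the block size")
    h = IV
    for i in range(0, len(message), BLOCK_BYTES):
        h = compress(h, message[i:i + BLOCK_BYTES])
    return h

x1, x2 = b"block-01", b"block-02"
assert iterated_hash(x1 + x2) == compress(compress(IV, x1), x2)
```

Because each $h_i$ depends on the message only through $h_{i-1}$ and $x_i$, a collision in the compression function for suitable chaining inputs carries over to the iterated hash, which is what the attack classification below builds on.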

Classification for compression function collision attacks. Higher numbers mean 
less degrees of freedom for an attacker and are hence more difficult to obtain 
cryptanalytically. 

— Compression collision attacks of type 0 

Compute hi- 1 , h*_ 1} rrn and to* s. t. /(/ij_i,TOj) = f(h*_ 1 ,m*). Note that 
early attacks by den Boer and Bosselaers [0 , and Dobbertin jHJ on MD5 are 
of this type. 

Compression collision attacks of type 1 

Given hi- 1 , compute m* and m* s. t. /(hj_i,TOj) = f(h*_ 1 ,m*). 

— Compression collision attacks of type 2 

Given hj_i and h*_ l5 compute to* and to* s. t. /(/ij_i,mj) = /(/i*_ 1 ; to*) 

— Compression collision attacks of type 3 

Given hi - 1 and /i*_ l5 compute to* s. t. f(hi-i,rrii) = f(h*_ 1 ,mi) 

Later in the paper, it will be useful to have a weakened version of the collision 
attack on the compression function of type 3. 

— Compression collision attacks of type 3w

Given $h_{i-1}$ and $h^*_{i-1}$ from an efficiently enumerable subset $s$ (of size $|s| = 2^{n-z}$)
of all $2^{2n}$ possible pairs $(h_{i-1}, h^*_{i-1})$, compute $m_i$ s.t. $f(h_{i-1}, m_i) =
f(h^*_{i-1}, m_i)$.

Complementing types 1-3 of the compression function attacks, one may define 
similar attack settings for the hash function as well. For the sake of concreteness, we
also give examples related to MD5. 

— Hash collision attacks of type 1: Given $m_0$, compute $m_1$ and $m^*_1$ such
that $F(m_0 \| m_1) = F(m_0 \| m^*_1)$. This is the simplest way to violate the
collision resistance of a hash function. For MD5, see Wang et al. [?]. The
prefix $m_0$ may be the string of length 0, or any other message block.

— Hash collision attacks of type 2: Given $m_0$ and $m^*_0$, compute $m_1$ and
$m^*_1$ such that $F(m_0 \| m_1) = F(m^*_0 \| m^*_1)$. This type of attack is much more
demanding from a cryptanalytic view as it needs to cope with arbitrary
prefixes and hence arbitrary chaining input differences (Stevens et al. [?]).
In turn it allows much more powerful attacks, as can be seen by the recent
attacks on certificate authorities using MD5 [?].

— Hash collision attacks of type 3 (new, in this paper): Given $m_0$ and
$m^*_0$, compute $m_1$ such that $F(m_0 \| m_1) = F(m^*_0 \| m_1)$. This type of attack
is in turn much more difficult than type 2, as it halves the degrees of freedom
available to an attacker. The message difference is fixed (to zero); this
means that for each MD5 compression function, instead of 1024 degrees of
freedom, only 512 degrees of freedom via the message input are available to
an attacker.

This leads us to the informal definition of a weak hash function, complementing
the concept of a weak compression function from [?]. A weak hash function
may be modeled as a random oracle, but additionally offers oracles that allow
collision attacks on the hash function of type 1 and type 2, but not of type 3.
The purpose of introducing a weak hash function is to show that MD5
cannot even meet the requirements of a weak hash function, even though no
type 3 collision attack on the MD5 compression function is known.

We may define the security of an $n$-bit hash function as a component in a
concatenated combiner against collision attacks (concatenated combiner collision
security, or simply $C^3$ security) as the effort needed to find
a collision of type 3. For MD5, despite all cryptanalytic advances in
recent years, this is $2^{64}$. In this paper, we show an attack suggesting that the $C^3$
security of MD5 is less.

3 Outline of Attack Strategies 

In the following we assume it is possible to devise collision attacks of type 3w 
on the compression function below the birthday bound. These collision attacks 
will need a suitable differential path, and a method to find a message pair which
conforms to such a differential path. We will discuss this problem for the case of
MD5 in Section 5. This alone is not enough for our attack to work, but based
on such a result we propose to continue as follows. We first show how to devise 
a collision attack of type 3 on a hash function using a combination of birthday 
techniques and differential shortcut techniques. Then we continue and apply such 
an attack on a combiner. 

3.1 Collision Attack of Type 3 

The attack we propose (see Fig. 1 for an illustration) consists of three phases,
executed in this order: a preparation phase that computes target differences (1),
a birthday phase (using $M_1$) (2), and a differential phase (using $M_2$) that
performs a type 3w collision attack (3).

Before the birthday phase (2), the differential phase needs to be "prepared"
as follows (1). We generate a number of $2^x$ distinct characteristics (also called
paths) through the compression function and put them on a heap. Each has the following property:
no message difference, an arbitrary input difference ($\delta_2$), and no output difference
($\delta_2 \boxplus \delta_3 = 0$). Let's assume each of them, when given a suitable chaining input
pair, results in an effort of $2^w$ (or less) to find a conforming message pair. Let $2^v$
be the cost of this path generation in terms of equivalent compression function
computations. Let's further assume that each of these paths has an average
number of $z$ independent conditions on the chaining input (CI).

Fig. 1. Outline of attack strategy (surviving figure labels: target differences, type 3w collision)

A single path with $z$ conditions on the CI in fact can be used for $2^{n-z}$ possible
pairs of CIs. Since there exist $2^{2n}$ pairs, $2^{n+z}$ randomly generated pairs would
be needed before one matches the CI described by the path ($\delta_1$ matches $\delta_2$, and
the conditions are fulfilled). Using birthday techniques, this is expected to take
$2^{(n+z)/2}$ time. Given all $2^x$ paths, only $2^{n+z-x}$ randomly generated pairs are
needed, which in turn is expected to take $2^{(n-x+z)/2}$ time. Hence, if $x > z$, the
runtime is expected to be below the birthday bound.
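The counting argument of this paragraph can be written out step by step. The derivation below only restates the quantities defined in the text and uses the usual birthday heuristic that $q$ hash evaluations yield roughly $q^2$ pairs.

```latex
\begin{align*}
\text{pairs of CIs covered by one path} &= 2^{n-z},\\
\text{random pairs needed for one path} &= \frac{2^{2n}}{2^{n-z}} = 2^{n+z},\\
\text{random pairs needed with } 2^{x} \text{ paths} &= \frac{2^{n+z}}{2^{x}} = 2^{n+z-x},\\
\text{birthday cost} &\approx \sqrt{2^{n+z-x}} = 2^{(n-x+z)/2},\\
2^{(n-x+z)/2} < 2^{n/2} &\iff x > z.
\end{align*}
```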

For obtaining a single hash collision of type 3, the overall method may be
seen as a successful cryptanalytic attack if the sum of the runtimes for the path
generation, the birthday phase, and the work to find a conforming message pair
using a particular path is below the birthday bound, i.e. if $2^v + 2^{(n-x+z)/2} +
2^w < 2^{n/2}$. For obtaining many hash collisions of type 3, the effort to generate
the heap of paths (1) may be negligible, hence the goal would be reduced to
$2^{(n-x+z)/2} + 2^w < 2^{n/2}$.

3.2 Attack on the Combiner $F_1(M) \| F_2(M)$

We now discuss how to use a type 3 collision attack on a hash to devise an 
attack on a combiner of two hash functions using it, where the first of two hash 
functions suffers from a type 1 collision attack. 

The setting: Let $F_1(\cdot)$ and $F_2(\cdot)$ be two hash functions with output size $n_1$
and $n_2$. For the sake of simplicity we assume in the following that $n_1 = n_2 = n$.
Let's further assume that $F_1$ suffers from a type 1 collision attack, i.e. given
$m_0$, let the effort to find an $m_1$ and $m^*_1$ such that $F_1(m_0 \| m_1) = F_1(m_0 \| m^*_1)$ be
$2^{c_1} < 2^{n/2}$. Furthermore, assume that $F_2$ suffers from a type 3 collision attack,
i.e. given $m_2$ and $m^*_2$, let the effort to compute $m_3$ such that $F_2(m_2 \| m_3) = F_2(m^*_2 \| m_3)$ be
$2^{c_2} < 2^{n/2}$. In more detail, as noted above, $2^{n+z-x}$ randomly generated pairs
$(m_2, m^*_2)$ are needed. The introduced symbols are summarized in Table 1.

We are now ready to formulate the new collision attack on the combiner
$F_1(M) \| F_2(M)$ that combines both attacks. It is also illustrated in Fig. 2.
Table 1. Symbols used in the description of the attack

(a) The known approach due to Joux does not allow exploiting shortcut
collision attacks on both hash functions. The lower bound is hence a
birthday attack on the "smaller" hash function.

(b) The new collision construction using type 3 collisions allows exploiting
shortcut attacks in both hash functions without considering the
interaction in the cryptanalysis.

Fig. 2. Comparison of collision attacks on a combiner $F_1(M) \| F_2(M)$


1. Let $m_0$ be the string of size zero and perform the type 1 collision attack on
$F_1$ to obtain a pair $(m_1^1, m_1^{1*})$ such that $F_1(m_1^1) = F_1(m_1^{1*})$. Note that $F_2(m_1^1)$
does not collide with $F_2(m_1^{1*})$.

2. Repeat the step above $(n+z-x)/2 - 1$ times while replacing $m_0$ with the
concatenation of all previously found messages. This means, for the $i$-th
step (for $i = 2 \ldots (n+z-x)/2$), let $m_0 = m_1^1 \| \ldots \| m_1^{i-1}$ and obtain a pair
$(m_1^i, m_1^{i*})$ such that $F_1(m_0 \| m_1^i) = F_1(m_0 \| m_1^{i*})$.

3. Note that by using Joux's multicollision method, we have produced a
$2^{(n+z-x)/2}$-collision for $F_1$.

4. Perform the type 3 attack on $F_2$ as follows. For the birthday part of the type
3 attack, use the $(n+z-x)/2$ collisions in $F_1$ to obtain the required $2^{n+z-x}$
pairs of prefixes $m_2$ and $m^*_2$.

5. Continue with the differential shortcut part of the type 3 attack as outlined
in the previous subsection, i.e. find a suffix $m_3$ such that there is a collision
between

$F_2(m_1^1 \| m_1^2 \| \ldots \| m_1^{(n+z-x)/2} \| m_3)$

and

$F_2(m_1^{1*} \| m_1^{2*} \| \ldots \| m_1^{((n+z-x)/2)*} \| m_3)$.

6. Also, the collision in $F_1$ remains:

$F_1(m_1^1 \| m_1^2 \| \ldots \| m_1^{(n+z-x)/2} \| m_3)$

collides with

$F_1(m_1^{1*} \| m_1^{2*} \| \ldots \| m_1^{((n+z-x)/2)*} \| m_3)$,

as after the multicollision the message block $m_3$ without a difference is added.

7. As the same message pair constitutes a collision for both $F_1$ and $F_2$, this in turn
results in a collision for the combiner.
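Steps 1 to 3 and step 6 can be tried out on a toy scale. The sketch below is an assumption-laden illustration, not the authors' implementation: `f1` is a hypothetical compression function (MD5 truncated to 16 bits), and the type 1 collision search is replaced by a plain table-based birthday search, which is feasible only because the toy output is so short.

```python
import hashlib
import itertools


def f1(h: bytes, block: bytes) -> bytes:
    """Toy compression function: MD5 truncated to 2 bytes (hypothetical stand-in for F_1)."""
    return hashlib.md5(h + block).digest()[:2]


def find_block_collision(h: bytes):
    """From chaining value h, find two distinct 4-byte blocks with the same output."""
    seen = {}
    for counter in itertools.count():
        block = counter.to_bytes(4, "big")
        out = f1(h, block)
        if out in seen:
            return seen[out], block, out
        seen[out] = block


def build_multicollision(iv: bytes, t: int):
    """Chain t collisions; any choice of one block per pair reaches the same chaining value."""
    pairs, h = [], iv
    for _ in range(t):
        a, b, h = find_block_collision(h)
        pairs.append((a, b))
    return pairs, h


def hash_f1(iv: bytes, blocks) -> bytes:
    """Iterate the toy compression function over a sequence of blocks."""
    h = iv
    for block in blocks:
        h = f1(h, block)
    return h
```

With `t` chained collisions, `itertools.product(*pairs)` enumerates the 2^t colliding messages that serve as the pool of prefixes in step 4. Appending the same suffix block to all of them keeps the collision in the toy `F_1`, which is the observation behind step 6.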

The computational complexity of this procedure is as follows. The type 1 collision
search on $F_1$ in step 1 is repeated $(n+z-x)/2$ times, which sums up to an effort
of $(n+z-x)/2 \cdot 2^{c_1}$. Afterwards the type 3 collision search in $F_2$ is performed
using the obtained multicollision. This consists of a birthday part and a type 3w
compression function attack, in total costing $2^{c_2}$ computations. Hence, the total
complexity is $(n+z-x) \cdot 2^{c_1-1} + 2^{c_2}$, and reusing the calculation for $c_2$ from
Section 3.1 we arrive at

$(n+z-x) \cdot 2^{c_1-1} + 2^v + 2^{(n-x+z)/2} + 2^w$. (1)


4 Alternative Description of MD5 

MD5 is an iterative hash function based on the Merkle-Damgård design principle
[4,19]. It processes 512-bit input message blocks and produces a 128-bit hash
value. If the message length is not a multiple of 512, an unambiguous padding
method is applied. For the description of the padding method we refer to [?].
The design of MD5 is similar to the design principles of MD4 [?]. In the following,
we briefly describe the compression function of MD5. It basically consists
of two parts: message expansion and state update transformation. A detailed
description of the MD5 hash function is given in [?].

4.1 Message Expansion 

The message expansion of MD5 is a permutation of the 16 message words $m_i$
in each round. For each of the four rounds, a permutation of these 16 message
words is used, resulting in 64 32-bit words, denoted by $W_i$, with $0 \le i \le 63$. For
the permutation defining the ordering of message words we refer to [?].
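For readers without the reference at hand, the word ordering can be written down directly. The index formulas below are those of the MD5 specification (RFC 1321); the function names `md5_message_order` and `expand` are chosen here for illustration only.

```python
from typing import List, Sequence


def md5_message_order() -> List[int]:
    """Index of the message word used in each of the 64 steps of MD5."""
    order = []
    for i in range(64):
        rnd = i // 16
        if rnd == 0:
            g = i                    # round 1: words in natural order
        elif rnd == 1:
            g = (5 * i + 1) % 16     # round 2
        elif rnd == 2:
            g = (3 * i + 5) % 16     # round 3
        else:
            g = (7 * i) % 16         # round 4
        order.append(g)
    return order


def expand(m: Sequence[int]) -> List[int]:
    """Return the 64 expanded words W_0..W_63 for 16 message words m."""
    if len(m) != 16:
        raise ValueError("MD5 expects 16 message words per block")
    return [m[g] for g in md5_message_order()]
```

Each round uses every message word exactly once, which is why the expansion adds no new degrees of freedom beyond the 16 input words.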

4.2 State Update Transformation 

The state update transformation of MD5 starts from a (fixed) initial value IV
$(A_{-4}, A_{-3}, A_{-2}, A_{-1})$ of four 32-bit registers and updates them in 4 rounds of
16 steps each. The state update transformation of MD5 works on four state
variables. The state update transformation can be written to update one variable
only:

$A_i = A_{i-1} + (A_{i-4} + f(A_{i-1}, A_{i-2}, A_{i-3}) + W_i + K_i) \lll s_i$.

However, in our case it turned out that a description which updates 2 state
variables $A_i$ and $B_i$ is beneficial. In this case, one step is computed as follows
(see also Fig. 3):

$B_i = (A_{i-4} + f(A_{i-1}, A_{i-2}, A_{i-3}) + W_i + K_i) \lll s_i$

$A_i = A_{i-1} + B_i$.

In each step of MD5, different step constants $K_i$, rotation values $s_i$ and Boolean
functions $f$ are used. For the definition of the constants and the rotation values
we refer to [?]. The Boolean function $f$ differs for each round of MD5: IF is
used in the first round, IF3 is used in round 2, XOR is used in round 3, and
ONX is used in the last round:

IF$(x, y, z) = xy \oplus \neg x z$
IF3$(x, y, z) = zx \oplus \neg z y$
XOR$(x, y, z) = x \oplus y \oplus z$
ONX$(x, y, z) = y \oplus (x \vee \neg z)$

After the last step of the state update transformation, the initial value and the
output values of the last four steps are combined (feed-forward), as in the
Davies-Meyer construction, resulting in the final value of one iteration. The
result is the final hash value or the initial value for the next message block.



Fig. 3. Alternative description of the step update transformation of MD5 using two 
state variables Ai and Bi 
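As a sanity check of the two-variable description, the following sketch implements MD5 with the update $B_i$, $A_i = A_{i-1} + B_i$ and compares it against Python's `hashlib`. The constants, rotation values, padding and word order follow RFC 1321. The mapping of the standard registers to $(A_{-4}, A_{-3}, A_{-2}, A_{-1}) = (a, d, c, b)$ is our reading of the notation and should be treated as an assumption.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]   # step constants K_i
S = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
     [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)                     # rotation values s_i


def rotl(v: int, s: int) -> int:
    """Rotate a 32-bit word left by s bits."""
    return ((v << s) | (v >> (32 - s))) & MASK


def boolean_f(i: int, x: int, y: int, z: int) -> int:
    """Round-dependent Boolean function of step i."""
    if i < 16:
        return (x & y) ^ (~x & z & MASK)        # IF
    if i < 32:
        return (z & x) ^ (~z & y & MASK)        # IF3
    if i < 48:
        return x ^ y ^ z                        # XOR
    return y ^ (x | (~z & MASK))                # ONX


def word_index(i: int) -> int:
    """Message word used in step i (RFC 1321 ordering)."""
    return [i, 5 * i + 1, 3 * i + 5, 7 * i][i // 16] % 16


def compress(cv, block: bytes):
    """One MD5 compression call written with the variables A_i and B_i."""
    a0, b0, c0, d0 = cv
    m = struct.unpack("<16I", block)
    A = [a0, d0, c0, b0]                        # A[j] holds A_{j-4}
    for i in range(64):
        B = rotl((A[-4] + boolean_f(i, A[-1], A[-2], A[-3])
                  + m[word_index(i)] + K[i]) & MASK, S[i])
        A.append((A[-1] + B) & MASK)            # A_i = A_{i-1} + B_i
    # feed-forward: add the last four state words to the chaining input
    return ((a0 + A[-4]) & MASK, (b0 + A[-1]) & MASK,
            (c0 + A[-2]) & MASK, (d0 + A[-3]) & MASK)


def md5(message: bytes) -> bytes:
    """MD5 digest computed with the two-variable step description."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", (8 * len(message)) & 0xFFFFFFFFFFFFFFFF)
    cv = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
    for off in range(0, len(padded), 64):
        cv = compress(cv, padded[off:off + 64])
    return struct.pack("<4I", *cv)
```

Comparing `md5(msg)` with `hashlib.md5(msg).digest()` for a few inputs is a quick way to confirm that the alternative description computes the same function as the standard one.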
5 Path Search Technique for MD5 Type 3 Collisions

We now tackle the problem of finding collision attacks on the compression function 
of MD5 of type 3w. Various automated path search techniques for MD4-like hash 
functions have been proposed in the past. In this section, we describe the new path 
search technique we developed to solve the problem. In fact it can be seen as a
variation of the fine-grained condition propagation originally proposed in [?].

5.1 Overview 

As illustrated in Fig. 4, the MSB-path of [?] is a building block of our technique.
Starting from this MSB-path in the middle of the compression function we will 
study and search for many characteristics which propagate through the ONX 
round in the forward direction, and through the IF round in the backward direc- 
tion in a non-linear way. The constraint is that, despite different rotation values 
and Boolean functions, resulting differences in both ends of the state update will 
cancel out after the feed-forward operation. 





Fig. 4. The outline of the type 3w collision search with IF-path, MSB-path and ONX 
path 

5.2 Reviewing the Path Search of De Canniere/Rechberger 

In 2006, De Cannière and Rechberger [?] proposed the concept of generalized conditions.
The generalized conditions on a particular pair of words $X$ will be denoted
by $\nabla X$. $\nabla X$ represents the set of values for which the conditions are satisfied.
In order to write this in a compact way, we will reuse the notation listed in
Table 2.

In [?], the authors describe a heuristic method to find complex nonlinear
characteristics for SHA-1 in an efficient way. Follow-up work directly applied
this method in various settings in the context of SHA-0 and SHA-1 [?].
The approach may be described as follows. 

1. The starting point is a number of constraints (on the message difference and
some target differences in the state) for the characteristic.

2. The basic idea of the algorithm is to randomly pick a bit position which is
not restricted yet (i.e., a '?'-bit), impose a zero-difference at this position (a
'-'-bit), and calculate how the condition propagates. This is repeated until all
unrestricted bits have been eliminated, or until it runs into an inconsistency,
in which case it starts again.

3. The basic idea was improved by also sometimes picking 'x'-bits once they
start to appear, guessing the sign of their differences ('u' or 'n'), and
backtracking if this does not lead to a solution.

Table 2. Notation for generalized conditions, possible conditions on a pair of bits. The
right half is for completeness only, and will not be used in the paper.
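To make the phrase "calculate how the condition propagates" concrete, the sketch below models generalized conditions as sets of bit pairs $(x, x^*)$ and refines them through a single bit of a Boolean function. The symbol table follows the notation of De Cannière and Rechberger as we understand it, since the body of Table 2 did not survive in this copy. The function `propagate` is a simplified illustration that ignores modular additions and carries, so it is not the search tool used by the authors.

```python
from itertools import product

# Pairs (x, x*) allowed by each symbol; only the symbols used in the text are listed.
COND = {
    "?": {(0, 0), (0, 1), (1, 0), (1, 1)},   # unrestricted
    "-": {(0, 0), (1, 1)},                   # no difference
    "x": {(0, 1), (1, 0)},                   # difference, sign unknown
    "0": {(0, 0)},
    "u": {(1, 0)},
    "n": {(0, 1)},
    "1": {(1, 1)},
    "#": set(),                              # contradiction
}


def propagate(f, ins, out):
    """Refine the conditions on one bit position of y = f(a, b, c).

    ins is a list of three sets of allowed (value, value*) pairs and out is the
    set allowed for the output bit. Returns the refined (ins, out); empty sets
    signal a contradiction.
    """
    new_ins = [set(), set(), set()]
    new_out = set()
    for pa, pb, pc in product(*ins):
        y = (f(pa[0], pb[0], pc[0]), f(pa[1], pb[1], pc[1]))
        if y in out:
            for refined, pair in zip(new_ins, (pa, pb, pc)):
                refined.add(pair)
            new_out.add(y)
    return new_ins, new_out
```

For example, feeding a 'x' bit and two '-' bits into XOR forces the output to 'x', while demanding an 'x' output from three '-' inputs yields empty sets, which is the kind of contradiction that triggers a restart or backtracking step.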

5.3 The Path Search for MD5 

We found that a direct mapping of this strategy to the case of MD5 did not 
lead to satisfactory results. It was not possible, with significant computational 
resources, to find a non-linear characteristic for the given setting. There are 
two main reasons for this difficulty. The first problem is caused by having two 
modular additions (separated by a rotation operation) within one state update. 
Fig. 3 shows the iterative step function of MD5 with variables $A_i$ and $B_i$. Hence,
two different carry expansions may occur and by guessing only bits of the state
$A_i$, conditions propagate slowly and contradictions are detected at a very late
stage. Table 3 shows an example with many free ('?') bits in $B_i$ due to guessing
bits only on $A_i$.

The second problem is the reduced set of starting constraints, with only a few bit
differences set in the chaining input. In the case of the type 3w collision search,
there is no difference in the message and there are only very few differences in
the chaining input and at the chaining output. By guessing even more zero-
differences ('-'-bits), the found characteristics tend to get very sparse. In fact,
these sparse characteristics are impossible, which is not detected early enough by
the path search algorithm. Hence, most of the time is spent on paths whose
impossibility should be detected earlier. An example for a sparse (in state variable
$A_i$), but impossible characteristic is given in Table 3.

To avoid these problems, the new MD5 path search strategy works as follows: 

1. The starting point is only a small number of constraints (the chaining input
difference, no message difference and the MSB path) for the characteristic.

2. Instead of just picking bits of Ai, randomly pick non-restricted bits of the 
state Bi as well. 

3. Immediately guess the sign of any unrestricted difference (‘x’-bits), as soon 
as it occurs and do a backtracking if the guess leads to a contradiction. 

4. If all ‘x’-bits have been determined, continue with randomly guessing zero- 
differences until the next ‘x’-bit occurs. 
Table 3. A sparse but impossible characteristic due to guessing too many zero-
differences in $A_i$. Further, conditions do not quickly propagate into $B_i$ and
contradictions are detected at a very late stage.



Whenever a contradiction occurs, a simple backtracking strategy (depth first 
search) is applied. Using this improved strategy, global contradictions (impossi- 
ble characteristics) are found at an earlier stage and impossible paths are less 
likely. The disadvantage of this strategy is that long carry expansions are more 
likely to occur and the resulting characteristics are less sparse. However, since we
apply the path search mostly in the first round of MD5, even a high number of
conditions can be fulfilled using simple message modification techniques [?].

6 Practical Realization and Results 

We now describe implementations of several parts of the attack. This illustrates 
and details the method, and also serves as a validity check of the attack. To
recapitulate our earlier description, the practical implementation of a type 3 
collision is divided into three steps: 

— Preparatory phase. Many special paths are searched and put on a heap. 

— Birthday phase. Looking through possible pairs of prefixes, a pair needs 
to be found that matches one of the paths on the heap. 

— Differential attack phase. Search for a conforming message pair using one 
of the characteristics generated earlier. 

An optimization that is important in practice is as follows. Starting from the
MSB path in the middle of the MD5 compression function, it suffices to compute
many paths through the last round (ONX part). The last steps of this path will
impose conditions (of type 'n' and 'u') on the chaining input. This information
is enough for the birthday phase. The result of the birthday phase is a prefix
pair that is compatible with a particular path on the heap. It remains to finish
the characteristic, the IF part, to connect to the MSB part in the middle (see
Fig. 4 for an illustration of the different parts). Having to deal with an actual
chaining input pair in this phase of the attack imposes more constraints on the
path search. However, as we detail in Section 6.1 and also illustrate with the
characteristic in the table in Appendix A, these constraints can be dealt with in
practice and do not impose any limitation on the attack.

6.1 Runtime for IF Path Search 

In experiments involving the equivalent of about 2000 hours on a single core, we
have verified that the average runtime to find a single IF path is about 36 hours on a
single core, which is about $2^{17}$ seconds, in which about $2^{38}$ MD5 computations¹
could be done. For these experiments, we not only generated paths for a particular
starting point, as the choice of a particular starting point has unpredictable
consequences for a particular heuristic (this was also observed in [?]). Instead
we generated many (about 30) starting points (i.e. different sets of conditions 
on the chaining input) in a random way to derive meaningful average runtime 
estimates. This suggests that, using the proposed strategy, we can expect to find 
a path for every set of constraints, albeit with somewhat varying runtime. In 
turn, this allows us to estimate the workfactor for a type 3w collision attack on 
MD5. 
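The unit conversions in this section are easy to reproduce. The helper below is a small sketch that turns single-core hours into a base-2 logarithm of equivalent MD5 computations, assuming the rate of about $2^{21}$ MD5 computations per second stated in the footnote. The function name and default argument are ours.

```python
import math


def log2_md5_equivalents(cpu_hours: float, log2_rate: float = 21.0) -> float:
    """log2 of the number of MD5 computations that fit into cpu_hours on one core.

    log2_rate is the assumed log2 of MD5 computations per second (2^21 per the footnote).
    """
    seconds = cpu_hours * 3600
    return math.log2(seconds) + log2_rate
```

Evaluating the helper for 36 hours gives a value close to 38, and for the 15000 core-hours reported in Section 6.2 a value close to 47, in line with the rounded figures quoted in the text.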

We found that the runtime of the search for an IF-path does not depend on the
number of differences in the CV.² The generation of the corresponding IF-paths
can be delayed until after the birthday phase; it contributes to the final search
complexity only in an additive way and is hence negligible.

6.2 A Type 3 Collision Attack Based on Actually Generated Paths 

For the practical generation of type 3w collision attacks on the compression
function of MD5, which in turn lead to a type 3 collision attack on the MD5
hash, we constrain ourselves to differential paths which result in runtimes for
finding a colliding message pair below $2^{58}$. For the preparatory step, it suffices
to generate useful ONX paths. An ONX path is useful if it has a high probability, as
the probability of a collision characteristic in the last round directly affects the
resulting effort for finding a conforming message pair. In order to give
a bound on the allowable probability for the ONX path, we argue as follows.
Among the four rounds (consisting of 16 steps each) the first round can easily
be dealt with via simple message modification. The second round is an MSB-path
and contains 16 conditions (the Boolean function needs to behave as expected
at every step once, see also [?]), the third round contains no conditions as the
Boolean function is an XOR, and the fourth round contains the more complex
ONX-path. Improvements upon the original type 1 collision attack on MD5 by
Wang et al. concentrated on fulfilling more conditions in round 2. In a work from
2005 [?], 14 conditions could already be fulfilled. Subsequent work by Klima [?]
and Stevens et al. [?] significantly improved upon this. Conservatively assuming
that only 14 conditions can be fulfilled suggests that round four should
not have more than 58 - 16 + 14 = 56 conditions. In Section 6.4 we give several
reasons why this is a very conservative assumption.

1 Each of our 2.0 GHz AMD Opteron(tm) cores performs about $2^{21}$ MD5 computations
per second using OpenSSL 0.9.8g.

2 We tested a range between 1 and 20.

Another important parameter of ONX paths is the number and position of 
differences it has in the last four steps, as this determines (except for carries via 
the feed-forward operation) the uniqueness of the set of allowed pairs of chaining 
inputs that can be canceled. 

Here, we report on empirical findings using an actual implementation of
parts of the attack. In total we spent an equivalent of about 15000 hours on
a single core. The number of distinct paths for type 3w compression function
attacks on MD5 we found together with their number of conditions on the IV is
as follows:

number of conditions on IV | 1 | 2 | 3  | 4   | 5    | 6    | 7     | 8     | 9     | 10
number of paths            | 0 | 0 | 10 | 130 | 1216 | 6556 | 21523 | 49293 | 87116 | 127018

Not all found paths may be of use. Let $P_i$ be the number of distinct paths
with $i$ conditions on the IV; we want to find a $j$ such that $(\sum_{i=1}^{j} P_i) \cdot 2^{-j}$ is
maximal. Using the actually generated paths as described above, we found about
$2^{17.34}$ paths with distinct constraints (with at most 9 relevant conditions) on the
chaining input. Including also all found paths with 10 conditions would
improve the attack only if more than $2^{17.34}$ paths would be added, which is not
the case.
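The choice of the cut-off $j$ and the resulting runtime estimate can be recomputed from the path counts reported in the table above. The sketch below takes $z = j$ as a worst-case bound on the number of conditions and uses $w = 58$ as the assumed upper limit for the differential phase. The function names are ours.

```python
import math

# Number of distinct paths found per number of conditions on the IV (from the table).
PATHS = {1: 0, 2: 0, 3: 10, 4: 130, 5: 1216, 6: 6556,
         7: 21523, 8: 49293, 9: 87116, 10: 127018}


def best_cutoff(paths):
    """Return (j, x) maximising (sum_{i<=j} P_i) * 2^-j, with x = log2 of that sum."""
    best_j, best_x, best_score, total = None, None, None, 0
    for j in sorted(paths):
        total += paths[j]
        if total == 0:
            continue
        x = math.log2(total)
        if best_score is None or x - j > best_score:
            best_j, best_x, best_score = j, x, x - j
    return best_j, best_x


def type3_runtime_log2(n: int, x: float, z: float, w: float) -> float:
    """log2 of 2^((n - x + z)/2) + 2^w, i.e. birthday phase plus differential phase."""
    birthday = (n - x + z) / 2
    top = max(birthday, w)
    return top + math.log2(2 ** (birthday - top) + 2 ** (w - top))
```

With these inputs `best_cutoff(PATHS)` selects $j = 9$ with $x \approx 17.34$, and `type3_runtime_log2(128, x, 9, 58)` evaluates to roughly 60.19, matching the figures in the text.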

Using the notation of Section 3, this means $x = 17.3$, $w < 58$, and $z \le 9$. Based
on this, a type 3 collision has a runtime of $2^{(128-17.34+9)/2} (+ 2^{58}) = 2^{60.19}$, which
is faster than the expected $2^{64}$ for an ideal hash function of this size. Hence,
MD5 offers a $C^3$ security of no more than 60 bits.

Note, however, that in this calculation there is a gross imbalance between the time
spent on generating paths (15000 CPU hours are about $2^{47}$ MD5 computations)
and the total runtime of the attack. Spending e.g. $2^7$ times more
computational resources on the path generation might well lead to an increase
from $x = 17$ to 24, which in turn would decrease the runtime of the overall type
3 collision attack on MD5 to $2^{57}$, and would lead to an attack on the combiner
MD5||SHA-1 with complexity less than $2^{59}$ (assuming the type 1 collision attack
on SHA-1 is fast enough).

6.3 On Memory Requirements 

Both the generic method due to Joux and the new approach using a type 3
collision attack can be implemented without requiring access to large memory.
For both cases, this results in a runtime loss of about a factor $n/2$, hence the
relative advantage of the new approach over the generic method remains. Memory
requirements of the attack (birthday phase and differential shortcut phase) are
as follows.
Birthday phase. A naive implementation of the birthday phase would require
a table of size $2^{(n-x+z)/2}$ in order to generate enough pairs to find a match with
one of the $2^x$ paths. However, distinguished point methods may be used on a
truncated version of the output of the compression function.³

Let $t$ be the size of the subset of bits that is needed to represent all $2^x$
paths. A lower bound for $t$ is $2x/3$, since every bit that is truncated leaves three
possibilities for a path ('n', 'u', or '-'). In practice, $t$ is higher. A memoryless
method will find a partially suitable pair in time $2^{t/2}$, which would need to be
repeated $2^{t-x+z}$ times if done independently (and hence impose the additional
condition $x - z > t/2$ on the attack to be more efficient than a generic attack).

However, as described in [21,22], the distinguished points method can be used
to take advantage of the birthday effect also for generating more collisions (or
suitable pairs), by keeping the entries in the list of each of the distinguished
points. A parallelizable version with linear speed gain is described in [?]. Hence
the search needs to be repeated only $2^{(t-x+z)/2}$ times. As a result, a "memoryless"
version of the birthday phase for the dedicated combiner attack behaves
to a large extent like a "memoryless" version of a generic birthday attack. What
is needed is memory to store $2^z$ candidate pairs which are the outcome of the
birthday phase. In all practical settings, $z$ is small.
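The distinguished points idea can be illustrated on a toy function. The sketch below is a generic collision search in the style of van Oorschot and Wiener, not the authors' code: `f` is a hypothetical 20-bit function built from truncated MD5, a point counts as distinguished when its low 4 bits are zero, and only one (start, length) entry per distinguished point is stored.

```python
import hashlib

BITS = 20       # toy output size (assumption for the example)
DP_BITS = 4     # a point is "distinguished" if its low DP_BITS bits are zero


def f(x: int) -> int:
    """Toy function whose collisions we look for: MD5 truncated to BITS bits."""
    digest = hashlib.md5(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - BITS)


def trail(start: int, max_len: int = 1 << 12):
    """Iterate f from start until a distinguished point is hit; return (point, length)."""
    x = start
    for length in range(1, max_len + 1):
        x = f(x)
        if x & ((1 << DP_BITS) - 1) == 0:
            return x, length
    return None     # probably stuck in a cycle without distinguished point; abandon


def locate_collision(s1: int, l1: int, s2: int, l2: int):
    """Walk two trails that end in the same point to the place where they merge."""
    while l1 > l2:
        s1, l1 = f(s1), l1 - 1
    while l2 > l1:
        s2, l2 = f(s2), l2 - 1
    if s1 == s2:
        return None             # one start lies on the other trail: no collision
    while f(s1) != f(s2):
        s1, s2 = f(s1), f(s2)
    return s1, s2


def collision_search(num_collisions: int):
    """Collect collisions of f while storing only one entry per distinguished point."""
    table, found, start = {}, [], 0
    while len(found) < num_collisions:
        start += 1
        result = trail(start)
        if result is None:
            continue
        point, length = result
        if point in table:
            pair = locate_collision(start, length, *table[point])
            if pair is not None:
                found.append(pair)
        else:
            table[point] = (start, length)
    return found
```

Keeping the table between runs is what lets later trails benefit from earlier ones, so each additional collision becomes cheaper than an independent restart. Memory grows with the number of stored distinguished points rather than with the number of function evaluations.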

Differential shortcut phase. Storing the precomputed paths for the shortcut
attacks requires on the order of a kilobyte per path. For practical values of $x$ between
10 and 20, storage costs are negligible and access to this memory is only needed
once.

6.4 On Conservative Estimates 

There are several reasons our estimates can be considered to be very conservative: 

— Basing assumptions on speed-up methods (message modification, tunnels) is
very conservative for the following reason. The lack of message differences
and the very simple MSB path in round 2 give more freedom to apply speed-up
methods than is the case in type 1 collision search attacks in earlier work.

— Also, early stop methods which further speed up collision search are not
considered.

— Runtimes of various path search scenarios are measurements of actual
implementations, whose runtime may be optimized by some constant factor.

— For our calculations, we use the highest possible allowed value for w (worst 
case). The expected value is in fact lower. 

7 Conclusions and Open Problems 

We proposed a new attack that allows collision attacks on combiners with a
runtime below the birthday bound of the smaller compression function when
the smaller compression function is MD5, potentially reducing the complexity of a collision attack
on a combiner like MD5||SHA-1 for the first time. This also answers an open
question by Joux posed in 2004. The cryptanalytic technique we proposed for
this is a combination of a birthday-style attack and a differential inside-out
technique that uses different parts of a collision characteristic at different stages of
an attack, both before and after a birthday phase. This technique may be of
independent interest. Based on only the characteristics we generated in practical
experiments with limited computational resources, a collision attack on the
combiner with MD5 would already be around $2^{60}$ (if the "normal" collision
attack on the other hash function is fast enough). However, we argued that such
an estimate is very conservative for various reasons.

3 We will use the term "memoryless" to refer to these techniques, although they do in
fact require some memory, albeit much less than a naive table-based approach.

This illustrates that the MD5 hash function cannot meet the requirements
of a "weak hash function" as informally defined in this paper. Various open
questions arise from this work: In a vein similar to concatenated combiners, or the
Zipper construction [?], is it possible to come up with other collision resistant
constructions that can use MD5, even though our results can be interpreted as
showing that MD5 is "weaker than weak"? Another open problem is related to
the application of our new cryptanalytic method to hash function constructions
that use two or more parallel streams, like RIPEMD-160 [?], as well as several
SHA-3 candidates. So far it proved difficult to obtain results on RIPEMD-160,
even for interesting reduced variants [?].
A Supplementary Material for Obtained Results

A particular low-weight input chaining difference becomes the MSB-path in the
course of 10 steps. The following table contains the full characteristic illustrating
a candidate for a type 3w compression function attack. As a proof-of-concept,
we provide a representative example of a conforming message pair in Table 4.

Table 4. A conforming message pair for the first 16 steps
Abstract. The search for SHA-3 is now well-underway and the 51 sub-
missions accepted for the first round reflected a wide variety of design
approaches. A significant number were built around Rijndael/ AES-based 
operations and, in some cases, the AES round function itself. Many of the 
design teams pointed to the forthcoming Intel AES instructions set, to 
appear on Westmere chips during 2010, when making a variety of perfor- 
mance claims. In this paper we study, for the first time, the likely impact 
of the new AES instructions set on all the SHA-3 candidates that might 
benefit. As well as distinguishing between those algorithms that are AES- 
based and those that might be described as AES-inspired, we have de- 
veloped optimised code for all the former. Since Westmere processors are 
not yet available, we have developed a novel software technique based on 
publicly available information that allows us to accurately emulate the 
performance of these algorithms on the currently available Nehalem pro- 
cessor. This gives us the most accurate insight to-date of the potential 
performance of SHA-3 candidates using the Intel AES instructions set. 


1 Introduction 

Intel has announced that a new AES instructions set¹ will be introduced in new
processors such as Westmere and available early in 2010. These instructions will
provide resistance to a range of software side-channel attacks [?] and offer
significant performance benefits for encryption and decryption using AES [?].
Simultaneously the NIST SHA-3 effort [?] to establish a new cryptographic
hash algorithm is well-underway and several teams of submitters have used AES- 
like transformations as a cryptographic building block. Several of these teams 
have explicitly expressed the assumption that their hashing algorithms could 
take advantage of AES-NI and thereby enjoy significant performance benefits. 
Since the Westmere processor is still unavailable, there have been no substantive 
efforts to assess the possible implications of this important issue. In this paper,
we provide the first quantitative analysis that estimates the likely impact of the
Intel AES instructions set on SHA-3 candidates.

1 Denoted AES-NI in this paper for "new instructions".

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 162-078] 2009.

The first step is to identify which SHA-3 candidates should be considered, and 
this is not as straightforward as it might appear. AES-NI can be used in different 
combinations to carry out different transformations, and so AES-NI might be 
used in many more ways than would naively be expected. As a result, there are 
submissions for which the variant that provides (say) 256-bit digests gains from 
AES-NI, while the same algorithm providing a 512-bit digest cannot. 

The second step is to develop a sound methodology for implementing the differ- 
ent algorithms, optimising them, and measuring their performance. Clearly this 
is a challenge when Westmere processors are unavailable. So we developed new 
techniques from publicly available information — in effect, uncovering the behavior 
of AES-NI — and this allowed us to emulate Westmere behavior on the publicly- 
available Nehalem chips. While this might appear to detract from the value of 
the performance figures we derive, the level of validation and confirmation that 
took place during this work makes us confident that our results are close to the 
Westmere reality. 

Our sole goal in this paper has been to compare the performance of SHA-3
candidates when using AES-NI. To this end, we have set aside cryptanalytic
discussions [?] and we have implemented and optimised all the algorithms that
we believe might benefit from AES-NI. While the authors of this paper are
independent (co-)submitters of two SHA-3 proposals, we have strived to be fair
and consistent. In addition, all the code is publicly available via [20] and we
welcome interested parties to download and improve upon it. When Westmere
processors appear, the same samples can be used for real silicon running AES-NI.

2 The Intel AES Instructions 

To start we provide a brief description of the Intel AES instructions; complete
details can be found in [?]. Intel’s AES instructions set consists of six
instructions, four of which (aesenc, aesenclast, aesdec, and aesdeclast) are
designed to support data encryption and decryption. The names of these
instructions are short for AES encryption (inner and last) round and AES
decryption (inner and last) round; see the table in Appendix A. These
instructions have register/register and register/memory variants.

There are two other instructions for the AES key expansion but they seem to 
be of little use to the SHA-3 submissions and are omitted from this paper. 

2.1 What Operations Can We Use AES-NI for? 

Clearly, AES instructions can be used whenever a SHA-3 proposal uses one 
of the internal or final AES encryption (or decryption) rounds. But they can 
be used more widely than this. For instance, calling aesdeclast and aesenc 
back-to-back, both with a zeroed second operand, is functionally equivalent to 
performing AES MixColumns on the first operand, see Appendix A. 
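
The identity above can be checked with a small functional model. The sketch below is a plain Python model of the two instructions, not xmm code; the function names, the column-major byte layout, and the brute-force S-box construction are our own choices for illustration. It assumes the usual definitions: aesenc applies ShiftRows, SubBytes, MixColumns and a key XOR, and aesdeclast applies InvShiftRows, InvSubBytes and a key XOR.

```python
def xtime(a):
    """Multiply by x in GF(2^8) modulo the AES polynomial 0x11B."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def gmul(a, b):
    """Multiply two elements of GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r


def build_sbox():
    """AES S-box: field inverse (found by brute force) followed by the affine map."""
    inverse = [0] * 256
    for a in range(1, 256):
        for b in range(1, 256):
            if gmul(a, b) == 1:
                inverse[a] = b
                break

    def rotl(v, n):
        return ((v << n) | (v >> (8 - n))) & 0xFF

    return [v ^ rotl(v, 1) ^ rotl(v, 2) ^ rotl(v, 3) ^ rotl(v, 4) ^ 0x63
            for v in inverse]


SBOX = build_sbox()
INV_SBOX = [0] * 256
for i, s in enumerate(SBOX):
    INV_SBOX[s] = i

# State layout: 16 bytes, column-major, byte (row r, column c) at index 4*c + r.


def shift_rows(s):
    return [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]


def inv_shift_rows(s):
    return [s[4 * ((c - r) % 4) + r] for c in range(4) for r in range(4)]


def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(2, a[r]) ^ gmul(3, a[(r + 1) % 4])
                       ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return out


def aesenc(state, key):
    """Model of aesenc: ShiftRows, SubBytes, MixColumns, XOR with key."""
    t = mix_columns([SBOX[b] for b in shift_rows(state)])
    return [x ^ k for x, k in zip(t, key)]


def aesdeclast(state, key):
    """Model of aesdeclast: InvShiftRows, InvSubBytes, XOR with key."""
    t = [INV_SBOX[b] for b in inv_shift_rows(state)]
    return [x ^ k for x, k in zip(t, key)]


ZERO = [0] * 16


def mix_columns_via_aesni(state):
    """aesdeclast then aesenc, both with a zeroed second operand."""
    return aesenc(aesdeclast(state, ZERO), ZERO)
```

The reason it works is that SubBytes acts byte by byte and therefore commutes with ShiftRows, so the inverse steps of aesdeclast cancel the first two steps of aesenc and only MixColumns remains.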


164 R. Benadjila et al. 


In fact if we use the pshufb instruction, which shuffles bytes in a 128-bit
word (see Appendix A), then we can isolate all of the AES constituents using
AES-NI [?], namely:

SubBytes, ShiftRows, MixColumns,

InvSubBytes, InvShiftRows, InvMixColumns.

To illustrate the versatility this gives us, we combine standard xmm instructions
with AES-NI to perform encryption with Rijndael [?] operating on 256-bit blocks.
The plaintext is stored in xmm_i and xmm_j, but AES-NI cannot be used directly
since half the bytes of xmm_i must be swapped with half the bytes of xmm_j.
However, this swap can be efficiently implemented using two pshufb (1) to pack
the bytes to be swapped into two 32-bit words, two pblendw (2) to swap the
32-bit words, and two pshufb (3) to re-order the bytes, giving in total the
required state permutation.

After this, aesenc can be applied in parallel to xmm_i and xmm_j, thereby giving
the appropriate ShiftRows for the large state, and Rijndael encryption on a
larger state has been emulated. Techniques like these are important to us since
it is possible that several SHA-3 candidates that do not use the complete AES
round, or that use a larger state, might still benefit from AES-NI.
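
To make the large-state trick concrete, the following sketch checks the ShiftRows part only, since SubBytes acts byte by byte and MixColumns column by column. It is a Python model under stated assumptions: a column-major layout with the low 16 bytes playing the role of one register and the high 16 bytes the other, and the Rijndael row offsets (0, 1, 3, 4) for an 8-column state. The helper names are hypothetical, and the pre-permutation is computed as one byte permutation rather than as the pshufb/pblendw/pshufb sequence described above.

```python
# Column-major layout: the byte in row r, column c sits at index 4*c + r.
AES_OFFSETS = (0, 1, 2, 3)            # ShiftRows offsets, 4-column state
RIJNDAEL256_OFFSETS = (0, 1, 3, 4)    # ShiftRows offsets, 8-column state


def shift_rows(state, ncols, offsets):
    """Rotate row r of the state left by offsets[r] columns."""
    out = [0] * (4 * ncols)
    for c in range(ncols):
        for r in range(4):
            out[4 * c + r] = state[4 * ((c + offsets[r]) % ncols) + r]
    return out


def pre_permutation():
    """For each byte slot of the two 16-byte halves, the index it is loaded from."""
    perm = [0] * 32
    for half in range(2):
        for d in range(4):
            for r in range(4):
                c = (d - r) % 4  # the 128-bit ShiftRows moves slot (r, d) to column c
                src_col = (4 * half + c + RIJNDAEL256_OFFSETS[r]) % 8
                perm[16 * half + 4 * d + r] = 4 * src_col + r
    return perm


def shift_rows_256_via_aes(state):
    """Pre-permute the 32 bytes, then apply the 128-bit ShiftRows to each half."""
    shuffled = [state[p] for p in pre_permutation()]
    low = shift_rows(shuffled[:16], 4, AES_OFFSETS)
    high = shift_rows(shuffled[16:], 4, AES_OFFSETS)
    return low + high


state = list(range(32))
direct = shift_rows(state, 8, RIJNDAEL256_OFFSETS)
emulated = shift_rows_256_via_aes(state)
# Number of bytes the low half receives from the high half.
crossing = sum(1 for src in pre_permutation()[:16] if src >= 16)
```

In this model exactly half of the bytes of each 16-byte half come from the other half, which agrees with the description of the swap given in the text.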


2.2 The “In-Scope” SHA-3 Candidates 

Obviously SHA-3 candidates that use the AES round as a building block can 
benefit from using AES-NI. In addition, algorithms that use the AES S-box 
along with some byte shuffling with or without the AES MDS mixing matrix 
can benefit. One can also apply these operations to larger states, as we have seen 
for Rijndael with 256-bit blocks. The main problems in using AES-NI tend to 
arise when designs move away from the AES MDS matrix. Generally speaking, 
this dramatically limits any potential performance gain from AES-NI, partic- 
ularly since most optimised assembly implementations would incorporate the 
MDS matrix operation into table look-ups, potentially combined with other op- 
erations. AES-NI might however still be of interest to these designs, especially 
in thwarting some side-channel attacks. 

There are four submissions that directly, and transparently, use AES rounds
for all hash output lengths. These are ECHO [2], LANE [?], SHAVITE-3 [?], and
VORTEX [23]. For these algorithms it is clear that we can directly use AES-NI.
There are others that are clearly inspired by Rijndael-like techniques in their
construction. These include CHEETAH [22], FUGUE [?], GRØSTL [?], LESAMNTA [?],
LUX [?], and TWISTER [?]. The submission SHAMATA [?] has already been
withdrawn, and while some other surveys [?] describe SARMAL [?] as being
AES-inspired, a non-AES S-box and MDS mixing layer take it out of scope.
Table 1. The SHA-3 submissions with substantial Rijndael-based components.
Checkmarks indicate those that might benefit from AES-NI, for different hash
output lengths.

Algorithm     224-bit   256-bit   384-bit   512-bit
ARIRANG          ✓         ✓        no        no
CHEETAH          ✓         ✓        no        no
ECHO             ✓         ✓         ✓         ✓
FUGUE           no        no        no        no
GRØSTL          no        no        no        no
LANE             ✓         ✓         ✓         ✓
LESAMNTA         ✓         ✓         ✓         ✓
LUX              ✓         ✓        no        no
SHAVITE-3        ✓         ✓         ✓         ✓
VORTEX           ✓         ✓         ✓         ✓


While LESAMNTA offers advantages for 256- and 512-bit hash outputs, it is
interesting that only the 256-bit versions of CHEETAH and LUX benefit from
AES-NI. By contrast, it appears that no variant of FUGUE, GRØSTL, or TWISTER
is likely to benefit. These algorithms use a very different MDS mixing matrix
to the AES and, as a result, end up being too distant to use AES-NI in any
efficient way. So even though a combination of AES-NI instructions could be
used to isolate the S-box operations for FUGUE and GRØSTL, say, the table
look-ups typically used for the MDS operations in current optimised
implementations mean that there is no easy way for these algorithms to benefit
from AES-NI.

Finally, even though the submission ARIRANG [?] is quite different from the
Rijndael-based constructions, it might potentially benefit from AES-NI. We have
therefore included it in our considerations and Table 1 summarizes the
(alphabetically ordered) list of algorithms and hash output lengths that we
consider.


3 Implementation and Measurements 

Obviously the best way to get performance timings is to write the appropriate 
code, run it on a Westmere processor (the first with AES-NI), and measure the 
performance. However, since this processor is not yet available, we propose a 
new methodology that can be used to get an accurate emulation of AES-NI. We 
rely on the fact that Westmere (formerly Nehalem-C) and Nehalem processors 
share the same micro-architecture. This means that if we can find suitable in- 
structions patterns that behave exactly as AES-NI instructions, we will get very 
good estimates for the future performance of AES-based SHA-3 candidates on 
a Westmere processor, but using today’s Nehalem processor. 

Previously, a substitution instruction was proposed [?] for future processors.
However, this substitution does not exhibit the correct behaviour for Westmere
and can give misleading results; see Section 3.1 and Appendix B. Here we
provide a particularly accurate replacement instructions pattern for aesenc and
we explain how to derive it from publicly available information only.
3.1 Replacement Instructions Pattern

The first step is to understand the exact behavior of the AES-NI instructions at
the micro-operation (μop) level², in particular that of aesenc and aesenclast.

An Intel code analyzer tool (IACA [?]) is publicly available and gives the
following information about aesenc (aesdec yields the same output):



(In this trace, ‘DV’ stands for the divider pipe of port 0, ‘D’ for the data fetch
pipe of ports 2 and 3. Additionally, an ‘X’ in the trace will be used to denote the
possible ports a μop can be dispatched to.)

This shows that aesenc consists of three μops, two of which are dispatched
to a unit on port 0 and one of which is dispatched to a unit on port 5, and that
the instruction’s latency is 6 cycles. However, this information is too coarse to
provide hints for the right replacement instructions pattern: we need to derive
the exact scheduling of these μops. In what follows, we represent μops by bars
whose length varies according to their latency. The gray bars denote the
μops on port 5 while the white ones denote the μops on port 0. For example, a
gray bar two units long is a 2 cycle μop on port 5 and a white bar three units
long is a 3 cycle μop on port 0.

From Intel’s white paper [?] we know that AES-NI are highly parallelizable.
This discards the sequential μop patterns on port 0. Moreover, the white paper
explains (see Fig. 9 and 15) that aesdec is structured using the equivalent inverse
cipher (described in Appendix B), which is confirmed by an IACA trace identical
to that of aesenc displayed above (see Appendix B). This leads us to assume
that the μop on port 5 is the exclusive-or with the key, which is corroborated by
the purpose of unit 5, see [?]. Therefore, the μop on port 5 runs in cycle 6 and
requires that the μops from port 0 are finished.

Intel’s optimization reference manual [?] gives additional information on the
possible μop latencies and throughput for each port on the Nehalem
micro-architecture. In particular, we see that μops dispatched on port 0 can
only have latencies of 1, 4, or 5 cycles, and that μops on port 5 all have a
1 cycle latency. Since aesenc has a total latency of 6 cycles, this only leaves
a small set of possible patterns. (Two μops cannot start at the same
cycle in the same unit but a μop is started as soon as possible to maximize the
overall throughput.) It is impossible that a μop on port 0 performs the SubBytes
and/or ShiftRows step while it runs in parallel with the other μop performing
the MixColumns step, which would then need the output of the first μop. So both
μops on port 0 perform at least one of the four MixColumn multiplications of the
MixColumns step. The most natural way of doing this is to symmetrically split
the computation on two independent halves of the state. In this case, the two
μops on port 0 have the same latency, which only leaves the pattern in which
two μops of equal latency on port 0 are followed by the μop on port 5.
This is again supported by the IACA trace of the aesimc instruction, as well as
the choice of inverse equivalent cipher for aesdec.
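
The elimination argument above can be replayed as a small enumeration. The sketch below is only one possible reading of the stated constraints, not the authors' own derivation; the scheduling model (first port-0 μop issued in cycle 0, second one cycle later, port-5 μop issued once both are done) and all names are our assumptions.

```python
from itertools import product

PORT0_LATENCIES = (1, 4, 5)   # latencies quoted in the text for port 0
PORT5_LATENCY = 1             # every port-5 uop takes one cycle
TOTAL_LATENCY = 6             # latency reported for aesenc


def candidate_schedules():
    """Latency pairs (first, second) for the two port-0 uops that fit the model."""
    found = []
    for first, second in product(PORT0_LATENCIES, repeat=2):
        first_done = first           # first uop issues in cycle 0
        second_done = 1 + second     # same unit, so the second issues a cycle later
        total = max(first_done, second_done) + PORT5_LATENCY
        overlapping = first > 1      # otherwise the two uops run back to back
        if total == TOTAL_LATENCY and overlapping:
            found.append((first, second))
    return found


candidates = candidate_schedules()
# Symmetric split of the state: both port-0 uops have the same latency.
symmetric = [pair for pair in candidates if pair[0] == pair[1]]
```

Under these assumptions the enumeration leaves a handful of overlapping schedules, and requiring equal latencies for the two port-0 μops narrows it to a single one.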

² Instructions are split into micro-operations and dispatched to specialized CPU units.
Now we turn to the replacement instructions which would give exactly the
same μop behavior as the instruction aesenc reg, reg. A previously proposed
replacement [?] is not appropriate for Westmere (see Appendix B). Instead,
a sequence that closely simulates the μop behavior of aesenc xmm_i, xmm_j is:
movdqu xmm_k, xmm_i

For now, let us ignore the movdqu instruction. The IACA trace displayed below
shows that the last three instructions of the replacement behave exactly as the
aesenc xmm0, xmm1 instruction with a latency of 6 cycles. It yields two identical
and independent μops (they both come from mulps) on port 0, and a 1 cycle μop on
port 5 which is forced to start after the two μops on port 0 since xorps has a
1 cycle μop on port 5 together with a dependency on register xmm2:



The reader might wonder why we added the movdqu instruction to the beginning
of the replacement: by introducing a dependency on xmm0, we try to prevent the
processor from re-ordering the instructions at the prefetch and re-order step.
Hence, movdqu acts as a fence and ensures that the replacement fragment exhibits
an atomic behavior similar to that of aesenc. Since movdqu only has a latency of
one cycle and can be dispatched on port 0, 1, or 5, it will in most cases
execute on port 1 in parallel with the other μops, where it does not interfere
with the replacement, and only rarely on port 5 or 0, which would add one cycle
to the replacement latency.

Note however that, though the replacement allows for a very good simulation
of aesenc in terms of latency, throughput, and port behavior, it does introduce
a significant issue: the use of a third register xmm_k (k = 2 in IACA’s trace)
might interfere with code surrounding the replacement by introducing false
dependencies. We took extra care in our implementations to avoid these when
using the replacement. This was not an easy task, especially for those SHA-3
candidates that make heavy use of AES-NI parallelism such as ECHO and LANE.

Another potential issue is that the aesenc instruction is 5 to 10 bytes long 
depending on the variant whereas our replacement is 13 to 22 bytes. This can lead 
to an efficiency penalty as the prefetch buffer of the Nehalem micro-architecture 
has a size of 16 bytes. However an experiment (see Appendix B) shows that the 
size of replacement is unlikely to be a significant factor. 

Finally, we refer the reader to Appendix B for a justification of our choice of
the following replacement for memory-based variants like aesenc xmm_i, [mem]:
movdqu xmm_k, xmm_i
mulps xmm_i, [mem]
mulps xmm_k, xmm_i

as well as for a discussion regarding replacements for other AES-NI instructions.
3.2 Timing Methodology

For each in-scope candidate and for each hash output length, we implemented
two versions of the submission. These were identical in every way, except that
one had AES-NI instructions and was used to ensure the correctness of our AES-NI
optimized implementation against the NIST-submitted test vectors with Intel’s
Emulator [?]; the other had AES-NI instructions substituted with their
replacements, allowing it to run on a Nehalem to derive performance estimates.

To get consistent results over the candidates, we measured the number of
cycles (using rdtsc instructions and averaging over more than 10^8 samples to get
stable results) taken by the compression function of each algorithm on the same
Nehalem machine running Linux. However NIST’s API was fully implemented
to check correctness and, in many cases, these were taken from the reference
code sent to NIST by the submitters. To eliminate as much noise as possible
from the OS, high priority scheduling was allocated to the measured code. All
algorithms were implemented by the same programmers, providing a somewhat
uniform level of optimization.

4 Candidate Descriptions and AES-NI Implementations 

In this section we consider the design and discuss the implementation of the
in-scope candidates. Full details of the algorithms can be found in the respective
algorithm descriptions, so we only give a brief overview of their functionality
along with insights into their design with regard to AES-NI. Our implementation
proposals will be available from our website [?].

ARIRANG is a single-pipe compression function-based proposal. The bulk of
the computation in the compression function consists of the 40-step expansion of
a 512-bit message block, which is highly efficient in general purpose registers and
can be pre-computed, and a StepFunction that is repeated 40 times. StepFunction
requires eight exclusive-ors, four fixed rotations, and two calls to a function
G_256 that uses elements of the AES. For longer hash outputs, the equivalent
function G_512 uses a larger MDS matrix that cannot be emulated using AES-NI,
and so any potential gain is restricted to 256-bit outputs.

However, the extent of this gain is very limited: ARIRANG uses 1/4 of
an AES round as a building block, and the latency cost of aesenc when only
performing 1/4 of an AES round means that the performance of AES-NI, when
compared to the use of lookup tables, is not competitive. Attempts to parallelize
two of the 1/4 AES rounds introduced too many overheads. We conclude that
AES-NI is unlikely to offer any substantial benefits to ARIRANG.

CHEETAH is a single-pipe compression function-based proposal. The compression
function consists of two strands of computation: a message-dependent
EXPANDED BLOCK is generated which provides a key-like input to encrypt the
INTERNAL STATE. While the computations on EXPANDED BLOCK and INTERNAL
STATE are both Rijndael-inspired, the former uses a different non-AES MDS
matrix that is hard to emulate. Thus this key derivation is unlikely to benefit
from AES-NI and the use of look-up tables seems better suited.

For operations on the INTERNAL STATE, the 224- and 256-bit versions of CHEETAH
use an operation InternalRound that can be emulated using AES-NI. However,
the inherent sequential nature of the rounds and the fact that AES-NI
cannot be used in the most straightforward way means that while there are
gains, they are not as significant as they might be for some other submissions.

For the 384- and 512-bit versions, the operation InternalRound is modified to
use a larger MDS matrix that, once again, cannot exploit AES-NI. So for these
larger outputs, there is unlikely to be any gain with AES-NI.

ECHO is a double-pipe compression-based hash function. The 224- and 256-bit
(resp. 384- and 512-bit) versions encrypt a state of sixteen 128-bit words in
eight (resp. ten) rounds of a compression function calculation. The encryption
round applies two AES rounds to each word of the state with a counter or salt as
a key, followed by a BIG.MixColumns MDS and row shift operation that provides
mixing across the entire state. For all hash output lengths, ECHO can benefit
from AES-NI and, while ECHO is primarily a double-pipe compression-based
hash function, a simple single-pipe variant was announced at the first NIST
workshop. We therefore include it in our considerations.

The AES encryption rounds are directly performed with aesenc with
precomputed keys in memory. This allows the algorithm to take full advantage of
the AES-NI parallelism. The BIG.MixColumns operation however cannot further
benefit from AES-NI, though it is based on MixColumns. As an ECHO encryption
round does not vary with the output length, the same optimizations apply.

LANE is a single-pipe compression function-based hash function. Compress
consists of a message expansion, a set of six P-permutations, and then a set
of two Q-permutations. As both sets of permutations are based on the AES
round, LANE benefits from AES-NI at all hash function output lengths.

Both permutations are made of L = 2 (resp. L = 4) lines of AES rounds for
hash outputs of 256 (resp. 512) bits and after each round of AES in each line,
an operation SwapColumns mixes the L computation strands. LANE therefore
offers two levels of parallelism: the P- and Q-permutations and the lines inside
the permutations. The latter alone does not allow us to take full advantage of
AES-NI parallelism as SwapColumns breaks the instructions flow, so we use the two
levels of parallelism simultaneously: we compute an AES round for each of the
6L lines of the P-permutations in parallel before applying SwapColumns in
each P-permutation, and do the same for the Q-permutations. (The code is
completely unrolled and all keys are precomputed.)

For 256-bit outputs, the state nicely fits the available xmm registers. But for
512-bit outputs, the state does not fit anymore and only three P-permutations
are computed in parallel instead of all six as before. This, in itself, does not
change the AES-NI throughput as the number of lines is doubled in each
permutation and thus the same number of AES rounds as before is performed
in parallel. However, the 512-bit version of SwapColumns imposes an additional
overhead.
LESAMNTA is a single-pipe compression function-based hash function. The
underlying block cipher has the general topology of an unbalanced Feistel cipher;
at each round two strands of the eight that comprise the cipher state are updated
using a message dependent “subkey” and the round function f_256 (resp. f_512)
for the 256-bit (resp. 512-bit) hash output. The subkey generation and the f_256
and f_512 functions in the encryption path all involve AES-like operations and
LESAMNTA can potentially benefit from AES-NI.

For the 256-bit version, the key schedule poses few problems. However, one 
difficulty for encryption path is that the AES-like transformations operate on 
64-bit values and the MDS matrix is distinct from that of AES. The MDS matrix 
( 12 ) that is used is however a submatrix of MixColumn and so inserting zero 
bytes at the entry of the appropriate MixColumns entries will allow to perform the 
AES-like transformation using AES-NI. This can be achieved with the sequence: 
pshufb, pxor with a particular constant, aesenc, and pshufb. Note that in this
case, aesenc is used at only a fraction of its normal efficiency.
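
The zero-insertion idea can be illustrated on a single AES column. The sketch below is a Python model with several loud assumptions: the small MDS matrix is taken to be [[2, 1], [1, 2]] (rows and columns 0 and 2 of the MixColumns matrix) purely for illustration, since the matrix printed in the text is not legible here; the padding byte is chosen as the S-box preimage of zero, which is our reading of "pxor with a particular constant"; ShiftRows is ignored because the surrounding pshufb steps would take care of byte placement. All function names are hypothetical.

```python
def xtime(a):
    """Multiply by x in GF(2^8) modulo the AES polynomial 0x11B."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def gmul(a, b):
    """Multiply two elements of GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r


def build_sbox():
    """AES S-box: field inverse (brute force) followed by the affine map."""
    inverse = [0] * 256
    for a in range(1, 256):
        for b in range(1, 256):
            if gmul(a, b) == 1:
                inverse[a] = b
                break

    def rotl(v, n):
        return ((v << n) | (v >> (8 - n))) & 0xFF

    return [v ^ rotl(v, 1) ^ rotl(v, 2) ^ rotl(v, 3) ^ rotl(v, 4) ^ 0x63
            for v in inverse]


SBOX = build_sbox()
PAD = SBOX.index(0)  # byte that SubBytes maps to zero


def mix_column(a):
    """AES MixColumns on one 4-byte column."""
    return [gmul(2, a[r]) ^ gmul(3, a[(r + 1) % 4]) ^ a[(r + 2) % 4] ^ a[(r + 3) % 4]
            for r in range(4)]


def small_round_direct(x, y):
    """S-box on two bytes, then the assumed 2x2 matrix [[2, 1], [1, 2]]."""
    sx, sy = SBOX[x], SBOX[y]
    return gmul(2, sx) ^ sy, sx ^ gmul(2, sy)


def small_round_via_aes_column(x, y):
    """Same result from SubBytes + MixColumns on a padded 4-byte column."""
    column = [x, PAD, y, PAD]                     # payload in rows 0 and 2
    mixed = mix_column([SBOX[b] for b in column])
    return mixed[0], mixed[2]
```

Only two of the four bytes in each column carry useful data in this model, which is one way to see why aesenc cannot run at its normal efficiency here.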

In the case of 512-bit hash outputs, the AES-like transformation in the key 
schedule involves an MDS that is too different from MixColumns, and so AES-NI 
is not really of any use there: the keys are therefore precomputed in a classical 
way. However, on the encryption side the round functions now use the full AES 
round, which gives nice advantages. 

For both output sizes, it is possible to use the unbalanced nature of the
Feistel construction to perform four f functions in parallel.
In the 256-bit version, this carries a greater benefit: the four instances of the
sequence preparing the data mentioned above can also be grouped to increase
the overall throughput.

LUX is a stream-cipher based hash function that uses two banks of cipher state;
the BUFFER and the CORE. At each iteration a block of message is input to both
the BUFFER and CORE, both of which are then updated with information being
passed between them. Sixteen blank rounds of computation seal the hashing
process after the last block of message has been processed. While the BUFFER
transformation is very simple, the CORE transformation is built on Rijndael-like
operations. And it is the Rijndael-like operations in the CORE that are the most
time-consuming parts of LUX, with mixing of the CORE and BUFFER requiring
only a few, simple xmm instructions.

For all hash output lengths, the CORE transformation operates on a larger
state than we find in the AES. However for 256-bit hash outputs it is equivalent
to Rijndael operating on 256-bit blocks and techniques described in Section 2.1
can be used. Thus LUX with 256-bit outputs will benefit from AES-NI.

When used to generate longer hash outputs, however, LUX changes the form
of the MDS transformation in such a way that it cannot easily be emulated
using AES-NI. It appears for these longer outputs that AES-NI will not offer
any advantage. In fairness, the optimised implementations of LUX for 512-bit
outputs are already extremely competitive.

As an aside on the timing methodology, it is worth observing that we
implemented sixteen iterations of the classical compression function found in
LUX as a single compress operation. This avoided BUFFER rotations and helped
treat LUX in a way that was more consistent with the other algorithms.

SHAVITE-3 is a single-pipe compression function-based design, with the com- 
pression function being built closely on a Feistel cipher. The round function for 
this Feistel cipher is built directly from an AES round, and the accompanying 
message expansion also uses the AES round function. As a result, all hash output 
sizes can expect to benefit from AES-NI. 

For the 256-bit hash output, the round function for the 12-round Feistel cipher 
consists of three rounds of the AES and we can therefore use AES-NI directly. To 
avoid any interaction with the memory, it is much more efficient to perform key 
derivation inside the xmm registers. Key derivation produces 36 subkeys of 128 bits 
using a combination of a non-linear layer based on four aesenc operations and 
a linear layer. It is possible to interleave key derivation with encryption since 
there are sufficient registers. The linear part of the key derivation only requires 
a few xmm manipulations (if handled properly) while the four AES rounds in the 
key schedule can be performed in parallel. The Feistel round function involves
three AES rounds, but this time they are chained. SHAVITE-3 derives a significant
benefit from avoiding memory access.

For the 512-bit hash output, the underlying 14-round block cipher is a
generalised Feistel network. At each round there are two parallel invocations of four
AES rounds. Now, however, key derivation produces new 128-bit words in sets of 
eight, rather than four, and so this needs to be performed in place while keeping 
the rest of the state in registers. The linear part of key derivation can still be im- 
plemented efficiently and the eight AES rounds can be parallelized. Within the 
encryption operation, there are now two Feistel round functions, each with four 
dependent AES rounds but these can be interleaved, increasing the throughput 
slightly. SHAVITE-3 is very closely built around the AES round operation and 
gains substantially from AES-NI. 

VORTEX is a single-pipe compression function-based design that uses the
enveloped Merkle-Damgård construction and builds upon MDC-2 [?]. The building
blocks of VORTEX are Rijndael rounds on 128-bit blocks for VORTEX-256 and
Rijndael rounds on 256-bit blocks for VORTEX-512. Cross-mixing between the
128-bit strands (resp. 256-bit strands for VORTEX-512) is multiplication-based.
The parameter M_T determines whether integer multiplication (M_T = 1) or
carry-less multiplication (M_T = 0) is used. A motivation behind VORTEX was to
directly exploit AES-NI and the carry-less multiplication instructions on future
Intel processors. In this paper we consider the case of M_T = 1. For VORTEX with
256-bit outputs we can directly exploit the aesenc operation. The key schedule
calls upon the AES S-box but this can be easily emulated. For the 512-bit
outputs, the underlying cipher operates on 256-bit states and, using similar
techniques to those described in Section 2.1, it is straightforward to operate
on this larger state. In contrast to some other algorithms, e.g. ECHO and LANE,
VORTEX fits into the registers. On the other hand, it turns out that there is a
bit less room to exploit AES-NI parallelism.
5 Implementation Results

Performance estimates for all SHA-3 candidates considered in this paper are
given in Table 2. The Nehalem measurements were made on a Core i7 920
processor³ clocked at 2.67 GHz with GNU/Linux Debian running a 2.6.26-1-amd64
kernel. The compiler was icc for amd64, Version 11.0, Build 20081105.
As explained in Section 3, we believe that these results will be very close to the
real performance of the algorithms when run on the Westmere processor. For
reference, some performance figures using assembly code from OpenSSL for
SHA-256 and SHA-512 timed under the same methodology on the same processor
are 18.6 and 12.0 cycles/Byte respectively. While our results are preliminary,
we feel they are sound enough to make some general observations.


Table 2. The predicted Westmere performance in cycles/Byte for those algorithms
that can benefit from the Intel AES instructions set. For illustration, we provide the
optimised performance figures given by submitters at the first NIST SHA-3 workshop.
Other performance data can be found at [?]. Since in all cases 224- and 384-bit outputs
are obtained by truncating 256- and 512-bit outputs, we only give figures for the latter.

                              256-bit              512-bit
Algorithm                 AES-NI  previous     AES-NI  previous
ARIRANG                    14.9     14.9          -      11.3
CHEETAH                     7.6      9.3          -      13.6
ECHO (double-pipe)          6.6     28.5        12.3     53.5
ECHO-SP (single-pipe)       5.7     24.4         8.1     35.7
LANE                        5.5     25.7        13.9    145.0
LESAMNTA                   30.8     52.7        19.9     51.2
LUX                         6.6     10.2          -       9.5
SHAVITE-3                   5.6     26.7         5.5     38.2
VORTEX (M_T = 1)            4.4     46.3         5.2     56.1
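
As a reading aid for Table 2, the ratio of the "previous" figure to the AES-NI estimate gives a rough indication of the projected gain for 256-bit outputs. The sketch below only re-uses the tabulated numbers; the dictionary and function names are ours, and the ratios are indicative at best, since the "previous" figures are those reported by submitters and were not necessarily measured under the methodology of Section 3.2.

```python
# 256-bit columns of Table 2: (AES-NI estimate, previous figure), in cycles/Byte.
TABLE2_256 = {
    "ARIRANG": (14.9, 14.9),
    "CHEETAH": (7.6, 9.3),
    "ECHO (double-pipe)": (6.6, 28.5),
    "ECHO-SP (single-pipe)": (5.7, 24.4),
    "LANE": (5.5, 25.7),
    "LESAMNTA": (30.8, 52.7),
    "LUX": (6.6, 10.2),
    "SHAVITE-3": (5.6, 26.7),
    "VORTEX (M_T = 1)": (4.4, 46.3),
}


def speedup(aesni, previous):
    """How many times fewer cycles/Byte the AES-NI estimate needs."""
    return previous / aesni


speedups = {name: speedup(*row) for name, row in TABLE2_256.items()}
largest = max(speedups, key=speedups.get)

for name, ratio in sorted(speedups.items(), key=lambda item: -item[1]):
    print(f"{name:24s} {ratio:5.2f}x")
```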


While it is tempting to group all AES/Rijndael-based SHA-3 submissions
together [?], one significant point of difference is that some will not be able to
take advantage of AES-NI. Further, there are some algorithms, e.g. CHEETAH and
LUX, for which the shorter hash outputs are likely to gain from AES-NI while the
longer hash outputs, i.e. 384- and 512-bit, won’t. Interestingly, CHEETAH is one
of the fastest AES-inspired SHA-3 submissions on the NIST reference platform.
But its performance when used with AES-NI is somewhat constrained by other
non-AES components and CHEETAH may be slightly less competitive than the
other algorithms when using AES-NI. That said, currently optimised code for
this algorithm is reasonably efficient anyway. Our results for LESAMNTA differ
from those at [?], which unfortunately use a different, inappropriate replacement
instruction (see Section 3.1 and Appendix B).

³ Note that to ensure stable and clean results, we disabled two features of the
processor: Hyperthreading and Turbo Boost.
Table 3. For those algorithms that solely use the AES round in its entirety, we give
the number of AES rounds/Byte as a crude measure of how much the AES is used
during the hashing process. We also give the cost, which is computed as the number of
cycles/AES round. In general terms, the lower the cost, the more efficiently the AES
round is being used with respect to AES-NI.

                                  256-bit                     512-bit
Algorithm                 AES-NI  #AES/Byte  cost     AES-NI  #AES/Byte  cost
ECHO (double-pipe)          6.6     1.33     4.96      12.3     2.50     4.92
ECHO-SP (single-pipe)       5.7     1.14     5.00       8.1     1.67     4.85
LANE                        5.5     1.31     4.20      14.3     1.75     8.17
SHAVITE-3                   5.6     0.81     6.91       5.5     1.31     4.20
VORTEX (M_T = 1)            4.4     0.72     6.11       5.2     0.72     7.22

As would be expected, algorithms that are specifically designed around the
AES round operation (ECHO, LANE, SHAVITE-3, and VORTEX) have the most
to gain by appealing to AES-NI. If we consider the figures for 256-bit hash
outputs then, for single-pipe variants, the throughput performance of these four
algorithms is similar. However there is a much greater contrast in performance
when we turn to 512-bit hash outputs, and this is due to differences in design. For
instance, SHAVITE-3 for 512-bit outputs gains substantially from AES-NI since
the modified round function for 512-bit outputs offers many opportunities for
parallelism. This is something that is especially suited to AES-NI. On the other
hand, when we move from 256- to 512-bit outputs with LANE, while the number
of AES operations per byte increases in roughly the same proportion as was the
case for SHAVITE-3, there is a performance impact that comes from doubling the
size of the lanes in the P- and Q-permutations. Of course, when compared to
existing optimised implementations LANE will still gain considerably when using
AES-NI. But it does demonstrate how different design decisions can lead to very
different performance profiles.

6 Conclusions 

In this paper we have provided the first in-depth analysis of the likely impact
of Intel’s AES instruction set on the first round SHA-3 candidates. To do this
we designed a new methodology to replicate and anticipate the likely behavior
of AES-NI in Westmere and we feel that this, in itself, will be of considerable
interest. We have also provided the first performance estimates for those
submissions that are likely to gain from AES-NI. Throughout we have tried to make a
consistent and comprehensive comparison, and we have used the best currently
available information. We believe that our predictions are accurate and, in fact,
may even be conservative. All the code we have developed is public [?] and this
will allow others to develop their own optimized versions and to obtain improved
performance projections.

Finally, this paper sheds light on what has, until now, been a somewhat hidden
issue. It is clear that the new Intel AES instruction set will have a profound
effect on the performance of some of the SHA-3 submissions. At the same time,
this low-level support for AES will become very widespread within a few years.
Certainly this is only one factor among many for the SHA-3 candidates; but it
may well be one of the important ones.

Appendix A: Instructions

Table 4. The instructions that provide AES encryption 



fs(Tmp) ; 

» (Tmp) ; 
Round Key 


Table 5. How to derive the MixColumns operation from AES-NI 



Tmp := InvShiftRows (Tmp);
Tmp := InvSubBytes (Tmp);
xmm1 := Tmp xor 0x0;

Tmp := ShiftRows (Tmp);
Tmp := SubBytes (Tmp);

Tmp := MixColumns (Tmp);
xmm1 := Tmp xor 0x0;
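The derivation in Table 5 can be checked with a small software model. The sketch below is a plain Python model of the AES round steps (not a binding to the hardware instructions); the function names `aesenc`, `aesdeclast` and `mix_columns_via_aesni` are illustrative, and the 16-byte state is assumed to be stored column by column. It applies the two sequences of Table 5 with an all-zero round key and compares the result with a direct MixColumns.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        a = ((a << 1) ^ 0x11B) if a & 0x80 else (a << 1)
        b >>= 1
    return r


def _rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF


def _make_sbox():
    """AES S-box: field inversion followed by the affine transformation."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        sbox.append(inv ^ _rotl8(inv, 1) ^ _rotl8(inv, 2)
                    ^ _rotl8(inv, 3) ^ _rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = _make_sbox()
INV_SBOX = [SBOX.index(i) for i in range(256)]


# The state is a list of 16 bytes; byte r + 4*c is row r of column c.
def shift_rows(s):
    return [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def inv_shift_rows(s):
    return [s[r + 4 * ((c - r) % 4)] for c in range(4) for r in range(4)]


def sub_bytes(s, box=SBOX):
    return [box[b] for b in s]


def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        out += [gmul(2, a[i]) ^ gmul(3, a[(i + 1) % 4]) ^ a[(i + 2) % 4] ^ a[(i + 3) % 4]
                for i in range(4)]
    return out


def xor(s, k):
    return [a ^ b for a, b in zip(s, k)]


def aesenc(state, key):
    """Model of one full encryption round: ShiftRows, SubBytes, MixColumns, key XOR."""
    return xor(mix_columns(sub_bytes(shift_rows(state))), key)


def aesdeclast(state, key):
    """Model of the last decryption round: InvShiftRows, InvSubBytes, key XOR."""
    return xor(sub_bytes(inv_shift_rows(state), INV_SBOX), key)


def mix_columns_via_aesni(state):
    """Table 5: the zero-key last decryption round cancels ShiftRows and SubBytes."""
    zero = [0] * 16
    return aesenc(aesdeclast(state, zero), zero)
```

The cancellation works because SubBytes acts on each byte independently and therefore commutes with ShiftRows, so only MixColumns survives.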


Description of Some Additional Operations Used in This Work 

pshufb xmm1, xmm2/m128 This instruction is used to generate a byte-wise
permutation of the contents of the first 128-bit operand, where the permutation is
defined by the second operand (xmm register or a memory location). The
second source operand (xmm2/m128) is used as a mask, as follows. For each byte of
xmm2/m128, the least significant four bits specify from where to select the
corresponding byte of the source operand (xmm1). In addition, if the most significant
bit of a byte of xmm2/m128 equals one, then, regardless of the values of the other
bits in that byte, zero is written in the result byte.
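As an illustration of the description above, here is a minimal Python model of the byte shuffle. It is a sketch only: operands are modelled as 16-byte sequences with index 0 taken as the least significant byte, and the function name simply mirrors the mnemonic.

```python
def pshufb(xmm1, mask):
    """Software model of the byte shuffle on two 16-byte operands.

    For each mask byte: if its top bit is set the result byte is zero,
    otherwise its low four bits select a byte of xmm1.
    """
    assert len(xmm1) == 16 and len(mask) == 16
    return bytes(0 if m & 0x80 else xmm1[m & 0x0F] for m in mask)


if __name__ == "__main__":
    src = bytes(range(0x10, 0x20))
    reverse = bytes(range(15, -1, -1))   # mask that reverses the byte order
    print(pshufb(src, reverse).hex())
```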

pblendw xmm1, xmm2/m128, imm8 This operation “blends” the contents of
two 128-bit operands (two registers or a register and a memory location) at the
granularity of 16-bit words. Words from the second operand are conditionally
written to the destination operand, depending on the setting of bits in the byte
operand imm8. If bit k of this byte is set, then word k of the source is copied to
the destination. If bit k is zero, word k of the destination is unchanged.
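The word-blending rule can be modelled in the same way. The following Python sketch treats both operands as 16-byte sequences made of eight 16-bit words, with word 0 stored first; this layout and the function name are assumptions made for illustration.

```python
def pblendw(dst, src, imm8):
    """Software model of the word blend: bit k of imm8 selects word k of src."""
    assert len(dst) == 16 and len(src) == 16 and 0 <= imm8 <= 0xFF
    out = bytearray(dst)
    for k in range(8):
        if (imm8 >> k) & 1:
            out[2 * k:2 * k + 2] = src[2 * k:2 * k + 2]   # copy word k from src
    return bytes(out)


if __name__ == "__main__":
    print(pblendw(bytes(16), bytes([0xFF] * 16), 0b00000101).hex())
```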

Appendix B: Rationale Behind the Replacements 

Additional IACA Traces 

AES-NI provides the aesimc instruction to perform InvMixColumns: 



The IACA tool supports the aesdec instruction, the trace of which is shown
below, but does not support the aesdeclast instruction. From what has been
derived for aesenc, aesdec, and aesimc, it is reasonable to assume its trace
would have been identical to that of aesdec.


Total Latency: 6 Cycles; Total number of Uops: 3 





Instructions Replacement Size 

In order to evaluate the possible impact on the prefetching step (the prefetch
buffer has a size of 16 bytes) or on the instruction cache, we conducted the
following experiment: we went through the same kind of analysis as we conducted
on aesenc and we replaced pmulld xmm15, [mem], which has two sequential μops
of 3 cycles on port 1, by

phminposuw xmm15, [mem]
phminposuw xmm15, xmm15

which have a single μop on port 1 each, but are interdependent. While the size of
pmulld is 7 bytes and the size of the proposed replacement is 17 bytes, they both
ran on the Nehalem with identical timings. Not only does this lend support to our
approach, but it also suggests that the increased size of our AES-NI instruction
set replacement is unlikely to have a significant effect.

Instructions Replacement for the Memory Variant

The aesenc reg, [mem] replacement we propose is actually quite similar to the
aesenc reg, reg one. The only difference lies in the simulation of the memory
access: it shouldn’t impact the μop flows and, to accurately simulate aesenc
reg, [mem], the corresponding μop should start at the same cycle as the first
μop on port 0. This is why we chose to launch the memory access at the first
mulps instruction:

movdqu xmmk, xmmi
mulps xmmi, [mem]

xorps xmmi, xmmj

The validity of this replacement is assessed by the two following IACA traces:


Total Latency: 12 Cycles; Total number of Uops: 4

An unfortunate side-effect of this replacement is that it affects an additional xmm
register, which adds constraints when avoiding false dependencies. This
mainly concerns the ECHO and LANE algorithms.


Equivalent Inverse Cipher 

The equivalent inverse cipher [?] allows for a decryption structure that is very
similar to that of encryption. This is achieved by noticing that the
straightforward decryption algorithm

InvShiftRows, InvSubBytes, AddRoundKey, InvMixColumns,

can be replaced by the equivalent one

InvSubBytes, InvShiftRows, InvMixColumns, AddRoundKey,

as the first two operations commute and the last two commute when the key
expansion is tweaked accordingly; decryption is now similarly structured to encryption:

SubBytes, ShiftRows, MixColumns, AddRoundKey.
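The key-expansion tweak mentioned above follows from the linearity of InvMixColumns over XOR. The short derivation below spells this out; the symbols $s$ (state), $k$ (round key) and $k'$ (tweaked round key) are notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{InvMixColumns}(s \oplus k)
  &= \mathrm{InvMixColumns}(s) \oplus \mathrm{InvMixColumns}(k), \\
\text{hence}\quad
\mathrm{InvMixColumns} \circ \mathrm{AddRoundKey}_{k}
  &= \mathrm{AddRoundKey}_{k'} \circ \mathrm{InvMixColumns},
  \qquad k' = \mathrm{InvMixColumns}(k).
\end{align*}
```

In other words, swapping the last two operations is valid provided each affected round key $k$ is replaced by $k' = \mathrm{InvMixColumns}(k)$, which is the kind of transformation the aesimc instruction computes. The first two operations commute because InvSubBytes acts on each byte independently of its position.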


An Inappropriate Replacement 

In this paragraph, we give the IACA trace for the pmuludq instruction. This
shows that the replacement proposed in [?] is not appropriate as a generic
aesenc replacement on the Nehalem architecture. In the trace below, pmuludq
has a latency of 3 cycles whereas the aesenc instruction has a latency of 6 cycles,
so the two instructions behave differently. It is even worse at the μop level, as
aesenc has 3 μops dispatched through ports 0 and 5 whereas pmuludq has a
single μop dispatched on port 1: this will lead to very distinct behaviors, and
almost certainly a different throughput.



[IACA trace: pmuludq xmm0, xmm1; a single μop on port 1, on the critical path (CP)]

This explains the differences in the performance of LESAMNTA derived in this
paper and quoted at [?].

Abstract. Group encryption (GE) schemes, introduced at Asiacrypt’07,
are an encryption analogue of group signatures with a number of
interesting applications. They allow a sender to encrypt a message (in the
CCA2 security sense) for some member of a PKI group concealing that
member’s identity (in a CCA2 security sense, as well); the sender is able
to convince a verifier that, among other things, the ciphertext is valid
and some anonymous certified group member will be able to decrypt the
message. As in group signatures, an opening authority has the power of
pinning down the receiver’s identity. The initial GE construction uses
interactive proofs as part of the design (which can be made non-interactive
using the random oracle model) and the design of a fully non-interactive
group encryption system is still an open problem. In this paper, we give
the first GE scheme, which is a pure encryption scheme in the standard
model, i.e., a scheme where the ciphertext is a single message and proofs
are non-interactive (and do not employ the random oracle heuristic). As
a building block, we use a new public key certification scheme which
incurs the smallest amount of interaction, as well.

Keywords: Group encryption, anonymity, provable security. 

1 Introduction 

Group encryption (GE) schemes, introduced by Kiayias, Tsiounis and Yung [?],
are the encryption analogue of group signatures [?]. The latter primitives
basically allow a group member to sign messages in the name of a group without
revealing his identity. In a similar spirit, GE systems aim to hide the identity of
a ciphertext’s recipient and still guarantee that he belongs to a population of
registered members in a group administered by a group manager (GM). A sender
can generate an anonymous encryption of some plaintext m intended for a
receiver holding a public key that was certified by the GM (message security and
receiver anonymity being both in the CCA2 sense). The ciphertext is prepared
while leaving an opening authority (OA) the ability to “open” the ciphertext
(analogously to the opening operation in group signatures) and uncover the
receiver’s name. At the same time, the sender should be able to convince a verifier
that (1) the ciphertext is a valid encryption under the public key of some group
member holding a valid certificate; (2) if necessary, the opening authority will
be able to find out who the receiver is; (3) (optionally) the plaintext is a witness
satisfying some public relation.

* This author’s research was supported by the Belgian Walloon Region project
ALAWN (Programme Wist 2).

** This author acknowledges the Belgian National Fund for Scientific Research
(F.R.S.-F.N.R.S.) for their financial support.

Motivations. The GE primitive was motivated by various privacy applications 
such as anonymous trusted third parties or oblivious retriever storage. Many 
cryptographic protocols such as fair exchange, fair encryption or escrow encryp- 
tion, involve trusted third parties that remain offline most of the time and are 
only involved to resolve problems. Group encryption allows one to verifiably 
encrypt some message to such a trusted third party while hiding his identity 
among a set of possible trustees. For instance, a user can encrypt a key (e.g., in 
an “international key escrow system” ) to his own national trusted representative 
without letting the ciphertext reveal the latter’s identity, which could leak infor- 
mation on the user’s citizenship. At the same time, everyone can be convinced 
that the ciphertext is heading for an authorized trustee. 

Group encryption also finds applications in ubiquitous computing, where
anonymous credentials must be transferred between peer devices belonging to
the same group. Asynchronous transfers may require involving an untrusted
storage server to temporarily store encrypted credentials. In such a situation,
GE schemes may be used to simultaneously guarantee that (1) the server retains
properly encrypted valid credentials that it cannot read; (2) credentials have
a legitimate anonymous retriever; (3) if necessary, an authority will be able to
determine who the retriever is.

By combining cascaded group encryptions using multiple trustees and accord- 
ing to a sequence of identity discoveries and transfers, one can also implement 
group signatures where signers can flexibly specify how a set of trustees should 
operate to open their signatures. 

Prior Works. Kiayias, Tsiounis and Yung (KTY) [?] formalized the
concept of group encryption and provided a suitable security modeling. They
presented a modular design of a GE system and proved that, beyond zero-knowledge
proofs, anonymous public key encryption schemes with CCA2 security, digital
signatures, and equivocal commitments are necessary to realize the primitive.
They also showed how to efficiently instantiate their general construction using
Paillier’s cryptosystem [?] (or, more precisely, a modification of the
Camenisch-Shoup [?] variant of Paillier). While efficient, their scheme is not a single
message encryption, since it requires the sender to interact with the verifier in a
Σ-protocol to convince him that the aforementioned properties are satisfied.
Interaction can be removed using the Fiat-Shamir paradigm [?] (and thus the
random oracle model [?]), but only heuristic arguments [?] (see also [?]) are
then possible in terms of security.

Independently, Qin et al. [?] considered a closely related primitive with
non-interactive proofs and short ciphertexts. However, they avoid interaction by
explicitly employing a random oracle and also rely on strong interactive
assumptions. As we can see, none of these schemes is a truly non-interactive encryption
scheme without the random oracle idealization.

Our Contribution. As already noted in various contexts such as anonymous
credentials [?], rounds of interaction are expensive and even impossible at times
as, in some applications, proofs should be verifiable by third parties that are
not present when provers are available. In the setting of group encryption, this
last concern is even more constraining as it requires the sender, who may be
required to repeat proofs with many verifiers, to maintain a state and remember
the random coins that he uses to encrypt every single ciphertext. In the frequent
situation where many encryptions have to be generated using independent
random coins, this becomes a definite bottleneck.

This paper solves the above problems and describes the first realization of
group encryption which is a fully non-interactive encryption scheme with
CCA2-security and anonymity in the standard model. In our scheme, senders do not
need to maintain a state: thanks to the Groth-Sahai [?] non-interactive proof
systems, the proof of a ciphertext can be generated once-and-for-all at the same
time as the ciphertext itself. Furthermore, using suitable parameters and for a
comparable security level, we can also shorten ciphertexts by a factor of 2 in
comparison with the KTY scheme. As far as communication goes, the size of
proofs reduces the number of bits transmitted between the sender and the
verifier by more than 75%.

Since our goal is to avoid interaction, we also design a joining protocol (i.e., a
protocol whereby the user effectively becomes a group member and gets his
public key certified by the GM) which requires the smallest amount of interaction:
as in the Kiayias-Yung group signature [?], only two messages have to be
exchanged between the GM and the user and the latter need not prove anything
about his public key. In particular, rewinding is not necessary in security proofs
and the join protocol can be safely executed in a concurrent environment, when
many users want to register at the same time. The join protocol uses a
non-interactive public key certification scheme where discrete-logarithm-type public
keys can be signed as if they were ordinary messages (and without knowing the
matching private key) while leaving the ability to efficiently prove knowledge
of the certificate/public key using the Groth-Sahai techniques. To certify users
without having to rewind¹ in security proofs, the KTY scheme uses groups of
hidden order (and more precisely, Camenisch-Lysyanskaya signatures [?]). In
public order groups, to the best of our knowledge, our construction is the first
certification method that does not require any form of proof of knowledge of
private keys. We believe it to be of independent interest as it can be used to
construct group signatures (in the standard model) where the joining
mechanism tolerates concurrency in the model of [?] without demanding more than
two moves of interaction.

¹ Although the simulator does not need to rewind proofs of knowledge in [23],
users still have to interactively prove the validity of their public key.

Organization. In section 2, we describe the intractability assumptions that
we need and recall the KTY model of group encryption. Section 3 explains
the building blocks of our construction and notably describes our certification
scheme. Our GE system is depicted in section 4.

2 Background 

In the paper, when $S$ is a set, $x \stackrel{R}{\leftarrow} S$ denotes the action of choosing $x$ at random
in $S$. By $a \in \mathsf{poly}(\lambda)$, we mean that $a$ is a polynomial in $\lambda$ while $b \in \mathsf{negl}(\lambda)$ says
that $b$ is a negligible function of $\lambda$. When $a$ and $b$ are two binary strings, $a \| b$
stands for their concatenation.

2.1 Complexity Assumptions 

We use groups $(\mathbb{G}, \mathbb{G}_T)$ of prime order $p$ with an efficiently computable map
$e : \mathbb{G} \times \mathbb{G} \to \mathbb{G}_T$ such that $e(g^a, h^b) = e(g, h)^{ab}$ for any $(g, h) \in \mathbb{G} \times \mathbb{G}$, $a, b \in \mathbb{Z}$
and $e(g, h) \neq 1_{\mathbb{G}_T}$ whenever $g, h \neq 1_{\mathbb{G}}$.

In this setting, we rely on an assumption introduced in [?] that allows
constructing efficient non-interactive proofs as pointed out in [27].

Definition 1. The Decision Linear Problem (DLIN) in $\mathbb{G}$ is to distinguish
the distribution $D_1 = \{(g, g^a, g^b, g^{ac}, g^{bd}, g^{c+d}) \mid a, b, c, d \stackrel{R}{\leftarrow} \mathbb{Z}_p^*\}$ from the
distribution $D_2 = \{(g, g^a, g^b, g^{ac}, g^{bd}, g^{z}) \mid a, b, c, d, z \stackrel{R}{\leftarrow} \mathbb{Z}_p^*\}$. The Decision Linear
Assumption is the intractability of DLIN for any PPT algorithm $\mathcal{D}$.

This problem amounts to deciding whether vectors $\vec{g}_1 = (g^a, 1, g)$, $\vec{g}_2 = (1, g^b, g)$
and $\vec{g}_3$ are linearly dependent or not. We also consider a related computational
problem which bears similarities with simultaneous pairing problems [?, 25].

Definition 2. The Simultaneous Double Pairing problem (S2P) in $\mathbb{G}$ is,
given $(g_1, g_2, g_{1,c}, g_{2,d}) \in \mathbb{G}^4$, to find a triple $(u, v, w) \in \mathbb{G}^3 \setminus \{(1_{\mathbb{G}}, 1_{\mathbb{G}}, 1_{\mathbb{G}})\}$ such
that $e(g_1, u) = e(g_{1,c}, w)$ and $e(g_2, v) = e(g_{2,d}, w)$.

Like the simultaneous triple pairing assumption [?], the hardness of this
problem is implied by the DLIN assumption: given $(g, g_1, g_2, g_1^c, g_2^d, \eta \stackrel{?}{=} g^{c+d})$, any
algorithm that, on input of $(g_1, g_2, g_1^c, g_2^d)$, outputs a non-trivial $(u, v, w)$ such
that $e(g_1, u) = e(g_1^c, w)$, $e(g_2, v) = e(g_2^d, w)$ allows telling whether $\eta = g^{c+d}$ by
testing if $e(g, u \cdot v) = e(\eta, w)$ (since $u = w^c$ and $v = w^d$).
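The reduction from DLIN to S2P can be illustrated in a transparent toy model. In the Python sketch below, every group element is represented by its discrete logarithm modulo a prime `P`, so the pairing becomes a product of exponents and group multiplication becomes addition. This model has no security whatsoever, and `toy_s2p_solver` is a hypothetical stand-in for an S2P adversary that cheats by reading exponents; only `dlin_decide` mirrors the test $e(g, u \cdot v) = e(\eta, w)$ described above.

```python
import random

P = 2**61 - 1  # prime used as toy group order (illustrative assumption)


def pair(x, y):
    """Toy pairing on discrete logs: e(g^x, g^y) = e(g, g)^(x*y)."""
    return (x * y) % P


def toy_s2p_solver(g1, g2, g1c, g2d):
    """Hypothetical S2P solver returning (u, v, w) with u = w^c and v = w^d.

    It recovers c and d directly from the exponents, which is only possible
    because this toy model exposes discrete logarithms.
    """
    c = g1c * pow(g1, -1, P) % P
    d = g2d * pow(g2, -1, P) % P
    w = random.randrange(1, P)
    return (w * c) % P, (w * d) % P, w


def dlin_decide(g, g1, g2, g1c, g2d, eta, solver=toy_s2p_solver):
    """Decide whether eta = g^(c+d) by testing e(g, u*v) == e(eta, w)."""
    u, v, w = solver(g1, g2, g1c, g2d)
    # Multiplying group elements u*v adds their discrete logs.
    return pair(g, (u + v) % P) == pair(eta, w)


if __name__ == "__main__":
    a, b, c, d = (random.randrange(1, P) for _ in range(4))
    real = dlin_decide(1, a, b, a * c % P, b * d % P, (c + d) % P)
    fake = dlin_decide(1, a, b, a * c % P, b * d % P, (c + d + 1) % P)
    print(real, fake)
```

The check succeeds exactly when the last element has exponent $c + d$, because $w$ is non-trivial.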

We also use the Hidden Strong Diffie-Hellman (HSDH) assumption introduced
in [?] as a strengthening of the Strong Diffie-Hellman assumption [?].

Definition 3. The $\ell$-Hidden Strong Diffie-Hellman problem ($\ell$-HSDH) in
$\mathbb{G}$ is, given $(g, \Omega = g^{\omega}, u) \stackrel{R}{\leftarrow} \mathbb{G}^3$ and triples $(g^{1/(\omega + c_i)}, g^{c_i}, u^{c_i})$ with $c_1, \ldots, c_\ell \stackrel{R}{\leftarrow} \mathbb{Z}_p^*$,
to find another triple $(g^{1/(\omega + c)}, g^{c}, u^{c})$ such that $c \neq c_i$ for $i = 1, \ldots, \ell$.

We finally need the following variant of the Diffie-Hellman assumption.

Definition 4. The Flexible Diffie-Hellman problem (FlexDH) is, given
$(g, g^a, g^b) \in \mathbb{G}^3$, where $a, b \stackrel{R}{\leftarrow} \mathbb{Z}_p^*$, to find a triple $(C, C^a, C^{ab})$ such that $C \neq 1_{\mathbb{G}}$.

A potentially easier problem considered in [?] only requires outputting $(C, C^{ab})$
on input of the same values. The latter problem was proved generically hard in
prime order groups [?]. In bilinear groups, any algorithm solving either of these
two problems would make it easy to recognize $g^{abc}$ on input of $(g, g^a, g^b, g^c)$,
which is a problem suggested for the first time in [?, Section 8].

2.2 Model and Security Notions 

Group encryption schemes involve a sender, a verifier, a group manager (GM)
that manages the group of receivers and an opening authority (OA) that is
able to uncover the identity of ciphertext receivers. A group encryption system
is formally specified by the description of a relation $\mathcal{R}$ as well as a collection
$\mathsf{GE} = (\mathsf{SETUP}, \mathsf{JOIN}, \langle \mathcal{G}_r, \mathcal{R}, \mathsf{sample}_{\mathcal{R}} \rangle, \mathsf{ENC}, \mathsf{DEC}, \mathsf{OPEN}, \langle \mathcal{P}, \mathcal{V} \rangle)$ of algorithms
or protocols. Among these, $\mathsf{SETUP}$ is a set of initialization procedures that all
take (explicitly or implicitly) a security parameter $\lambda$ as input. They can be split
into one that generates a set of public parameters $\mathsf{param}$ (a common reference
string), one for the GM and another one for the OA. We call them $\mathsf{SETUP}_{\mathsf{init}}(\lambda)$,
$\mathsf{SETUP}_{\mathsf{GM}}(\mathsf{param})$ and $\mathsf{SETUP}_{\mathsf{OA}}(\mathsf{param})$, respectively. The latter two procedures
are used to produce key pairs $(\mathsf{pk}_{\mathsf{GM}}, \mathsf{sk}_{\mathsf{GM}})$, $(\mathsf{pk}_{\mathsf{OA}}, \mathsf{sk}_{\mathsf{OA}})$ for the GM and the OA.
In the following, $\mathsf{param}$ is incorporated in the inputs of all algorithms although
we sometimes omit to explicitly write it.

$\mathsf{JOIN} = (\mathsf{J}_{\mathsf{user}}, \mathsf{J}_{\mathsf{GM}})$ is an interactive protocol between the GM and the
prospective user. As in [?], we will restrict this protocol to have minimal interaction and
consist of only two messages: the first one is the user’s public key $\mathsf{pk}$ sent by $\mathsf{J}_{\mathsf{user}}$
to $\mathsf{J}_{\mathsf{GM}}$ and the latter’s response is a certificate $\mathsf{cert}_{\mathsf{pk}}$ for $\mathsf{pk}$ that makes the user’s
group membership effective. We do not require the user to prove knowledge of his
private key $\mathsf{sk}$ or anything else about it. In our construction, valid keys will be
publicly recognizable and users do not need to prove their validity. After the
execution of $\mathsf{JOIN}$, the GM stores the public key $\mathsf{pk}$ and its certificate $\mathsf{cert}_{\mathsf{pk}}$ in a public
directory database.

Algorithm $\mathsf{sample}$ allows sampling pairs $(x, w) \in \mathcal{R}$ (made of a public value
$x$ and a witness $w$) using keys $(\mathsf{pk}_{\mathcal{R}}, \mathsf{sk}_{\mathcal{R}})$ produced by $\mathcal{G}_r$. Depending on the
relation, $\mathsf{sk}_{\mathcal{R}}$ may be the empty string (as will be the case in our scheme). The
testing procedure $\mathcal{R}(x, w)$ returns 1 whenever $(x, w) \in \mathcal{R}$. To encrypt a witness
$w$ such that $(x, w) \in \mathcal{R}$ for some public $x$, the sender fetches the pair $(\mathsf{pk}, \mathsf{cert}_{\mathsf{pk}})$
from database and runs the randomized encryption algorithm. The latter takes
as input $w$, a label $L$, the receiver’s pair $(\mathsf{pk}, \mathsf{cert}_{\mathsf{pk}})$ as well as public keys $\mathsf{pk}_{\mathsf{GM}}$
and $\mathsf{pk}_{\mathsf{OA}}$. Its output is a ciphertext $\psi \leftarrow \mathsf{ENC}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, w, L)$. On
input of the same elements, the certificate $\mathsf{cert}_{\mathsf{pk}}$, the ciphertext $\psi$ and the
random coins $\mathit{coins}_\psi$ that were used to produce it, the non-interactive algorithm
$\mathcal{P}$ generates a proof $\pi_\psi$ that there exists a certified receiver whose public key
was registered in database and that is able to decrypt $\psi$ and obtain a witness $w$
such that $(x, w) \in \mathcal{R}$. The verification algorithm $\mathcal{V}$ takes as input $\psi$, $\mathsf{pk}_{\mathsf{GM}}$, $\mathsf{pk}_{\mathsf{OA}}$,
$\pi_\psi$ and the description of $\mathcal{R}$ and outputs 0 or 1. Given $\psi$, $L$ and the receiver’s
private key $\mathsf{sk}$, the output of $\mathsf{DEC}$ is either a witness $w$ such that $(x, w) \in \mathcal{R}$ or a
rejection symbol $\perp$. Finally, $\mathsf{OPEN}$ takes as input a ciphertext/label pair $(\psi, L)$
and the OA’s secret key $\mathsf{sk}_{\mathsf{OA}}$ and returns a receiver’s public key $\mathsf{pk}$.

The security model considers four properties termed correctness, message
security, anonymity and soundness. In the following, we sometimes denote by
$\langle \mathsf{output}_A \mid \mathsf{output}_B \rangle \leftarrow \langle A(\mathsf{input}_A), B(\mathsf{input}_B) \rangle(\text{common-input})$ the execution of a
protocol between $A$ and $B$ obtaining their own outputs from their inputs.

Correctness. The correctness property requires that the following experiment
returns 1 with overwhelming probability.

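Experiment $\mathbf{Expt}^{\mathrm{correctness}}(\lambda)$

$\mathsf{param} \leftarrow \mathsf{SETUP}_{\mathsf{init}}(\lambda)$; $(\mathsf{pk}_{\mathcal{R}}, \mathsf{sk}_{\mathcal{R}}) \leftarrow \mathcal{G}_r(\lambda)$; $(x, w) \leftarrow \mathsf{sample}_{\mathcal{R}}(\mathsf{pk}_{\mathcal{R}}, \mathsf{sk}_{\mathcal{R}})$;
$(\mathsf{pk}_{\mathsf{GM}}, \mathsf{sk}_{\mathsf{GM}}) \leftarrow \mathsf{SETUP}_{\mathsf{GM}}(\mathsf{param})$; $(\mathsf{pk}_{\mathsf{OA}}, \mathsf{sk}_{\mathsf{OA}}) \leftarrow \mathsf{SETUP}_{\mathsf{OA}}(\mathsf{param})$;

$\langle \mathsf{pk}, \mathsf{sk}, \mathsf{cert}_{\mathsf{pk}} \mid \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}} \rangle \leftarrow \langle \mathsf{J}_{\mathsf{user}}, \mathsf{J}_{\mathsf{GM}}(\mathsf{sk}_{\mathsf{GM}}) \rangle(\mathsf{pk}_{\mathsf{GM}})$;

$\psi \leftarrow \mathsf{ENC}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, w, L)$;

$\pi_\psi \leftarrow \mathcal{P}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, w, L, \psi, \mathit{coins}_\psi)$;

If $\big((w \neq \mathsf{DEC}(\mathsf{sk}, \psi, L)) \vee (\mathsf{pk} \neq \mathsf{OPEN}(\mathsf{sk}_{\mathsf{OA}}, \psi, L))$

$\vee\, (\mathcal{V}(\psi, L, \pi_\psi, \mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}) = 0)\big)$ return 0 else return 1;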

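Message Security. The message secrecy property is defined by an experiment
where the adversary has access to oracles that may be stateful (and maintain a
state across queries) or stateless:

- $\mathsf{DEC}(\mathsf{sk}, \cdot)$: is a stateless oracle for the user decryption function $\mathsf{DEC}$. When
this oracle is restricted not to decrypt a ciphertext-label pair $(\psi, L)$, we
denote it by $\mathsf{DEC}^{\neg\langle \psi, L \rangle}$.

- $\mathsf{CH}^{b}_{\mathrm{ror}}(\lambda, \mathsf{pk}, w, L)$: is a real-or-random challenge oracle that is only queried
once. It returns $(\psi, \mathit{coins}_\psi)$ such that $\psi \leftarrow \mathsf{ENC}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, w, L)$
if $b = 1$ whereas, if $b = 0$, $\psi \leftarrow \mathsf{ENC}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, w', L)$ encrypts a
random plaintext $w'$ uniformly chosen in the space of plaintexts of length $O(\lambda)$.
In either case, $\mathit{coins}_\psi$ are the random coins used to generate $\psi$.

- $\mathsf{PROVE}^{b}_{\mathcal{P}, \mathcal{P}'}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, \mathsf{pk}_{\mathcal{R}}, x, w, \psi, L, \mathit{coins}_\psi)$: is a stateful
oracle that the adversary can query on multiple occasions. If $b = 1$, it runs the
real prover $\mathcal{P}$ on the inputs to produce an actual proof $\pi_\psi$. If $b = 0$, the
oracle runs a simulator $\mathcal{P}'$ that uses the same inputs as $\mathcal{P}$ except witness
$w$, $\mathit{coins}_\psi$ and generates a simulated proof.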

These oracles are used in an experiment where the adversary controls the GM,
the OA and all members but the honest receiver. The adversary $\mathcal{A}$ is the
dishonest GM that certifies the honest receiver in an execution of $\mathsf{JOIN}$. She has oracle
access to the decryption function $\mathsf{DEC}$ of that receiver. At the challenge phase,
she probes the challenge oracle for a label and a pair $(x, w) \in \mathcal{R}$ of her choice.
After the challenge phase, she can also invoke the $\mathsf{PROVE}$ oracle on multiple
occasions and eventually aims to guess the bit $b$ chosen by the challenger.

As pointed out in [?], designing an efficient simulator $\mathcal{P}'$ (for executing
$\mathsf{PROVE}^{b}_{\mathcal{P}, \mathcal{P}'}(\cdot)$ when $b = 0$) is part of the security proof and might require a
simulated common reference string.

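Definition 5. A GE scheme satisfies message security if, for any PPT
adversary $\mathcal{A}$, the experiment below returns 1 with probability at most $1/2 + \mathsf{negl}(\lambda)$.

Experiment $\mathbf{Expt}^{\mathrm{sec}}_{\mathcal{A}}(\lambda)$

$\mathsf{param} \leftarrow \mathsf{SETUP}_{\mathsf{init}}(\lambda)$; $(\mathsf{aux}, \mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}) \leftarrow \mathcal{A}(\mathsf{param})$;

$\langle \mathsf{pk}, \mathsf{sk}, \mathsf{cert}_{\mathsf{pk}} \mid \mathsf{aux} \rangle \leftarrow \langle \mathsf{J}_{\mathsf{user}}, \mathcal{A}(\mathsf{aux}) \rangle(\mathsf{pk}_{\mathsf{GM}})$;

$(\mathsf{aux}, x, w, L, \mathsf{pk}_{\mathcal{R}}) \leftarrow \mathcal{A}^{\mathsf{DEC}(\mathsf{sk}, \cdot)}(\mathsf{aux})$; If $(x, w) \notin \mathcal{R}$ return 0;
$b \stackrel{R}{\leftarrow} \{0, 1\}$; $(\psi, \mathit{coins}_\psi) \leftarrow \mathsf{CH}^{b}_{\mathrm{ror}}(\lambda, \mathsf{pk}, w, L)$;

$b' \leftarrow \mathcal{A}^{\mathsf{PROVE}^{b}_{\mathcal{P}, \mathcal{P}'}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}, \mathsf{cert}_{\mathsf{pk}}, \mathsf{pk}_{\mathcal{R}}, x, w, \psi, L, \mathit{coins}_\psi),\, \mathsf{DEC}^{\neg\langle \psi, L \rangle}(\mathsf{sk}, \cdot)}(\mathsf{aux})$;

If $b = b'$ return 1 else return 0;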

Anonymity. In anonymity attacks, the adversary controls the whole system but
the opening authority and performs a kind of chosen-ciphertext attack on the
encryption scheme of the OA. She registers two keys $\mathsf{pk}_0$, $\mathsf{pk}_1$ in database and, for
a pair $(x, w) \in \mathcal{R}$ of her choosing, obtains an encryption of $w$ under $\mathsf{pk}_b$ for some
$b \in \{0, 1\}$ chosen by the challenger. She is granted access to decryption oracles
w.r.t. both keys $\mathsf{pk}_0$, $\mathsf{pk}_1$. In addition, she may invoke the following oracles:

- $\mathsf{CH}^{b}_{\mathrm{anon}}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}_0, \mathsf{pk}_1, w, L)$: is a challenge oracle that is only queried
once by the adversary. It returns a pair $(\psi, \mathit{coins}_\psi)$ consisting of a ciphertext
$\psi \leftarrow \mathsf{ENC}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}_b, \mathsf{cert}_{\mathsf{pk}_b}, w, L)$ and the coin tosses $\mathit{coins}_\psi$ that were
used to generate $\psi$.

- $\mathsf{USER}(\mathsf{pk}_{\mathsf{GM}})$: is a stateful oracle simulating two executions of $\mathsf{J}_{\mathsf{user}}$ to
introduce two honest users in the group. It uses a string $\mathsf{keys}$ where the outputs
of the two executions are written.

- $\mathsf{OPEN}(\mathsf{sk}_{\mathsf{OA}}, \cdot)$: is a stateless oracle that simulates the opening algorithm on
behalf of the OA and, on input of a GE ciphertext, returns the receiver’s
public key.

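Definition 6. A GE scheme satisfies anonymity if, for any PPT adversary $\mathcal{A}$,
the experiment below returns 1 with a probability not exceeding $1/2 + \mathsf{negl}(\lambda)$.

Experiment $\mathbf{Expt}^{\mathrm{anon}}_{\mathcal{A}}(\lambda)$

$\mathsf{param} \leftarrow \mathsf{SETUP}_{\mathsf{init}}(\lambda)$; $(\mathsf{pk}_{\mathsf{OA}}, \mathsf{sk}_{\mathsf{OA}}) \leftarrow \mathsf{SETUP}_{\mathsf{OA}}(\mathsf{param})$;

$(\mathsf{aux}, \mathsf{pk}_{\mathsf{GM}}) \leftarrow \mathcal{A}(\mathsf{param}, \mathsf{pk}_{\mathsf{OA}})$; $\mathsf{aux} \leftarrow \mathcal{A}^{\mathsf{USER}(\mathsf{pk}_{\mathsf{GM}}),\, \mathsf{OPEN}(\mathsf{sk}_{\mathsf{OA}}, \cdot)}(\mathsf{aux})$;

If $\mathsf{keys} \neq (\mathsf{pk}_0, \mathsf{sk}_0, \mathsf{cert}_{\mathsf{pk}_0}, \mathsf{pk}_1, \mathsf{sk}_1, \mathsf{cert}_{\mathsf{pk}_1})(\mathsf{aux})$ return 0;

$(\mathsf{aux}, x, w, L, \mathsf{pk}_{\mathcal{R}}) \leftarrow \mathcal{A}^{\mathsf{OPEN}(\mathsf{sk}_{\mathsf{OA}}, \cdot),\, \mathsf{DEC}(\mathsf{sk}_0, \cdot),\, \mathsf{DEC}(\mathsf{sk}_1, \cdot)}(\mathsf{aux})$;

If $(x, w) \notin \mathcal{R}$ return 0;

$b \stackrel{R}{\leftarrow} \{0, 1\}$; $(\psi, \mathit{coins}_\psi) \leftarrow \mathsf{CH}^{b}_{\mathrm{anon}}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}_0, \mathsf{pk}_1, w, L)$;

$b' \leftarrow \mathcal{A}^{\mathcal{P}(\mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}_b, \mathsf{cert}_{\mathsf{pk}_b}, x, w, \psi, L, \mathit{coins}_\psi),\, \mathsf{OPEN}^{\neg\langle \psi, L \rangle}(\mathsf{sk}_{\mathsf{OA}}, \cdot),\, \mathsf{DEC}^{\neg\langle \psi, L \rangle}(\mathsf{sk}_0, \cdot),\, \mathsf{DEC}^{\neg\langle \psi, L \rangle}(\mathsf{sk}_1, \cdot)}(\mathsf{aux})$;

If $b = b'$ return 1 else return 0;

As shown in [?], GE schemes satisfying the above notion necessarily subsume a
key-private (a.k.a. receiver anonymous) [?] cryptosystem.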

Soundness. In a soundness attack, the adversary creates the group of receivers
by interacting with the honest GM. Her goal is to produce a ciphertext $\psi$ and a
convincing proof that $\psi$ is valid w.r.t. a relation $\mathcal{R}$ of her choice but either (1)
the opening reveals a receiver’s public key $\mathsf{pk}$ that does not belong to any group
member; (2) the output $\mathsf{pk}$ of $\mathsf{OPEN}$ is not a valid public key (i.e., $\mathsf{pk} \notin \mathcal{PK}$,
where $\mathcal{PK}$ is the space of valid public keys); (3) the ciphertext $\psi$ is not in the
space $\mathcal{C}^{x, L, \mathsf{pk}_{\mathcal{R}}, \mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}}$ of valid ciphertexts. This notion is formalized by a game
where the adversary is given access to a user registration oracle $\mathsf{REG}(\mathsf{sk}_{\mathsf{GM}}, \cdot)$
that simulates $\mathsf{J}_{\mathsf{GM}}$. This oracle maintains a repository database where registered
public keys and their certificates are stored.

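Definition 7. A GE scheme is sound if, for any PPT adversary $\mathcal{A}$, the
experiment below returns 1 with negligible probability.

Experiment $\mathbf{Expt}^{\mathrm{soundness}}_{\mathcal{A}}(\lambda)$

$\mathsf{param} \leftarrow \mathsf{SETUP}_{\mathsf{init}}(\lambda)$; $(\mathsf{pk}_{\mathsf{OA}}, \mathsf{sk}_{\mathsf{OA}}) \leftarrow \mathsf{SETUP}_{\mathsf{OA}}(\mathsf{param})$;
$(\mathsf{pk}_{\mathsf{GM}}, \mathsf{sk}_{\mathsf{GM}}) \leftarrow \mathsf{SETUP}_{\mathsf{GM}}(\mathsf{param})$;

$(\mathsf{pk}_{\mathcal{R}}, x, \psi, \pi_\psi, L, \mathsf{aux}) \leftarrow \mathcal{A}^{\mathsf{REG}(\mathsf{sk}_{\mathsf{GM}}, \cdot)}(\mathsf{param}, \mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{sk}_{\mathsf{OA}})$;

If $\mathcal{V}(\psi, L, \pi_\psi, \mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}) = 0$ return 0;
$\mathsf{pk} \leftarrow \mathsf{OPEN}(\mathsf{sk}_{\mathsf{OA}}, \psi, L)$;

If $\big((\mathsf{pk} \notin \text{database}) \vee (\mathsf{pk} \notin \mathcal{PK}) \vee (\psi \notin \mathcal{C}^{x, L, \mathsf{pk}_{\mathcal{R}}, \mathsf{pk}_{\mathsf{GM}}, \mathsf{pk}_{\mathsf{OA}}, \mathsf{pk}})\big)$
then return 1 else return 0;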


2.3 Groth-Sahai Proof Systems 

In the following notations, for equal-dimension vectors $\vec{A}$ and $\vec{B}$ containing group
elements, $\vec{A} \odot \vec{B}$ stands for their component-wise product.

When based on the DLIN assumption, the Groth-Sahai (GS) proof systems
[?] use a common reference string comprising vectors $\vec{g}_1, \vec{g}_2, \vec{g}_3 \in \mathbb{G}^3$, where
$\vec{g}_1 = (g_1, 1, g)$, $\vec{g}_2 = (1, g_2, g)$ for some $g_1, g_2 \in \mathbb{G}$. To commit to $X \in \mathbb{G}$, one
sets $\vec{C} = (1, 1, X) \odot \vec{g}_1^{\,r} \odot \vec{g}_2^{\,s} \odot \vec{g}_3^{\,t}$ with $r, s, t \stackrel{R}{\leftarrow} \mathbb{Z}_p^*$. When the proof system is
configured to give perfectly sound proofs, $\vec{g}_3$ is chosen as $\vec{g}_3 = \vec{g}_1^{\,\xi_1} \odot \vec{g}_2^{\,\xi_2}$ with
$\xi_1, \xi_2 \stackrel{R}{\leftarrow} \mathbb{Z}_p^*$. Commitments $\vec{C} = (g_1^{r + \xi_1 t}, g_2^{s + \xi_2 t}, X \cdot g^{r + s + t(\xi_1 + \xi_2)})$ are then
Boneh-Boyen-Shacham (BBS) ciphertexts that can be decrypted using $\alpha_1 = \log_g(g_1)$,
$\alpha_2 = \log_g(g_2)$. In the witness indistinguishability (WI) setting, vectors $\vec{g}_1, \vec{g}_2, \vec{g}_3$
are linearly independent and $\vec{C}$ is a perfectly hiding commitment. Under the
DLIN assumption, the two kinds of CRS are indistinguishable.
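The claim that a commitment on a perfectly sound CRS is a BBS ciphertext can be checked in a toy model. The Python sketch below represents each group element by its discrete logarithm to the base $g$ modulo a prime `P`, so that $\odot$ becomes component-wise addition and exponentiation becomes scalar multiplication. The model offers no security, and all function names are illustrative assumptions rather than part of the scheme.

```python
import random

P = 2**61 - 1  # prime used as toy group order (illustrative assumption)


def vec_exp(v, k):
    """Raise each component of a vector of group elements to the power k."""
    return tuple(k * x % P for x in v)


def vec_mul(*vs):
    """Component-wise product of vectors of group elements."""
    return tuple(sum(col) % P for col in zip(*vs))


def setup_sound():
    """CRS for the perfectly sound setting, with trapdoor (alpha1, alpha2)."""
    a1, a2, xi1, xi2 = (random.randrange(1, P) for _ in range(4))
    g1 = (a1, 0, 1)            # (g_1, 1, g) with g_1 = g^alpha1
    g2 = (0, a2, 1)            # (1, g_2, g) with g_2 = g^alpha2
    g3 = vec_mul(vec_exp(g1, xi1), vec_exp(g2, xi2))
    return (g1, g2, g3), (a1, a2)


def commit(crs, x):
    """Commit to the group element X = g^x."""
    g1, g2, g3 = crs
    r, s, t = (random.randrange(1, P) for _ in range(3))
    return vec_mul((0, 0, x), vec_exp(g1, r), vec_exp(g2, s), vec_exp(g3, t))


def decrypt(trapdoor, c):
    """BBS decryption: X = C_3 / (C_1^(1/alpha1) * C_2^(1/alpha2))."""
    a1, a2 = trapdoor
    c1, c2, c3 = c
    return (c3 - c1 * pow(a1, -1, P) - c2 * pow(a2, -1, P)) % P


if __name__ == "__main__":
    crs, td = setup_sound()
    print(decrypt(td, commit(crs, 42)) == 42)
```

Decryption succeeds because the first two components carry $r + \xi_1 t$ and $s + \xi_2 t$ in the exponents of $g_1$ and $g_2$, which is exactly the mask applied to $X$ in the third component.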

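To commit to an exponent $x \in \mathbb{Z}_p$, one computes $\vec{C} = \vec{\varphi}^{\,x} \odot \vec{g}_1^{\,r} \odot \vec{g}_2^{\,s}$, with
$r, s \stackrel{R}{\leftarrow} \mathbb{Z}_p^*$, using a CRS comprising vectors $\vec{\varphi}, \vec{g}_1, \vec{g}_2$. In the soundness setting
$\vec{\varphi}, \vec{g}_1, \vec{g}_2$ are linearly independent vectors (typically $\vec{\varphi} = \vec{g}_3 \odot (1, 1, g)$ where
$\vec{g}_3 = \vec{g}_1^{\,\xi_1} \odot \vec{g}_2^{\,\xi_2}$) whereas, in the WI setting, choosing $\vec{\varphi} = \vec{g}_1^{\,\xi_1} \odot \vec{g}_2^{\,\xi_2}$ gives a perfectly
hiding commitment since $\vec{C}$ is always a BBS encryption of $1_{\mathbb{G}}$.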

To prove that committed variables satisfy a set of relations, the GS techniques 
replace variables by the corresponding commitments in each relation. The whole 
proof consists of one commitment per variable and one proof element (made of 
a constant number of group elements) per relation. 

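Such proofs are available for pairing-product relations, which are of the type

$$\prod_{i=1}^{n} e(A_i, X_i) \cdot \prod_{i=1}^{n} \prod_{j=1}^{n} e(X_i, X_j)^{a_{ij}} = t_T,$$

for variables $X_1, \ldots, X_n \in \mathbb{G}$ and constants $t_T \in \mathbb{G}_T$, $A_1, \ldots, A_n \in \mathbb{G}$, $a_{ij} \in \mathbb{Z}_p$,
for $i, j \in \{1, \ldots, n\}$. Efficient proofs also exist for multi-exponentiation equations

$$\prod_{i=1}^{m} A_i^{y_i} \cdot \prod_{j=1}^{n} X_j^{b_j} \cdot \prod_{i=1}^{m} \prod_{j=1}^{n} X_j^{y_i \gamma_{ij}} = T,$$

for variables $X_1, \ldots, X_n \in \mathbb{G}$, $y_1, \ldots, y_m \in \mathbb{Z}_p$ and constants $T, A_1, \ldots, A_m \in \mathbb{G}$,
$b_1, \ldots, b_n \in \mathbb{Z}_p$ and $\gamma_{ij} \in \mathbb{Z}_p$, for $i \in \{1, \ldots, m\}$, $j \in \{1, \ldots, n\}$.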

Multi-exponentiation equations admit zero-knowledge proofs at no additional
cost. On a simulated CRS (prepared for the WI setting), a trapdoor makes it
possible to simulate proofs without knowing witnesses, and simulated proofs are
perfectly indistinguishable from real proofs. As for pairing-product equations,
zero-knowledge proofs are often possible but usually come at some expense. In
the paper, we only resort to such NIZK simulators on one occasion.

In both cases, proofs for quadratic equations cost 9 group elements. Linear
pairing-product equations (when $a_{ij} = 0$ for all $i, j$) take 3 group elements
each. Linear multi-exponentiation equations of the type $\prod_{j=1}^{n} X_j^{b_j} = T$ (resp.
$\prod_{i=1}^{m} A_i^{y_i} = T$) demand 3 (resp. 2) group elements.

3 Building Blocks 

Our certification scheme uses a trapdoor commitment to group elements as an 
important ingredient to dispense with proofs of knowledge of users’ private keys. 

3.1 A Trapdoor Commitment to Group Elements 

We need a trapdoor commitment scheme that allows committing to elements of 
a group G where bilinear map arguments are taken. Commitments will have to 
be themselves elements of G, which prevents us from using Groth’s scheme 
where commitments lie in the range Gt of the pairing. 

Such commitments can be obtained using the perfectly hiding Groth-Sahai
commitment based on the linear assumption recalled in section 2.3. This
commitment uses a common reference string describing a prime order group $\mathbb{G}$ and
a generator $f \in \mathbb{G}$. The commitment key consists of vectors $(\vec{f}_1, \vec{f}_2, \vec{f}_3)$ chosen as
$\vec{f}_1 = (f_1, 1, f)$, $\vec{f}_2 = (1, f_2, f)$ and $\vec{f}_3 = \vec{f}_1^{\,\xi_1} \odot \vec{f}_2^{\,\xi_2} \odot (1, 1, f)^{\xi_3}$, with $f_1, f_2 \leftarrow \mathbb{G}$,
$\xi_1, \xi_2, \xi_3 \leftarrow \mathbb{Z}_p^*$. To commit to $X$, the sender picks $\phi_1, \phi_2, \phi_3 \leftarrow \mathbb{Z}_p^*$ and sets
$\vec{C}_X = (1, 1, X) \odot \vec{f}_1^{\,\phi_1} \odot \vec{f}_2^{\,\phi_2} \odot \vec{f}_3^{\,\phi_3}$, which, if $\vec{f}_3$ is parsed as $(f_{3,1}, f_{3,2}, f_{3,3})$, can
be written $\vec{C}_X = (f_1^{\phi_1} \cdot f_{3,1}^{\phi_3},\; f_2^{\phi_2} \cdot f_{3,2}^{\phi_3},\; X \cdot f^{\phi_1+\phi_2} \cdot f_{3,3}^{\phi_3})$. Due to the use of GS proofs,
commitment openings need to only consist of group elements (and no scalar). To
open $\vec{C}_X = (C_1, C_2, C_3)$, the sender reveals $(D_1, D_2, D_3) = (f^{\phi_1}, f^{\phi_2}, f^{\phi_3})$ and
$X$. The receiver is convinced that the committed value was $X$ by checking that

$$\begin{cases} e(C_1, f) = e(f_1, D_1) \cdot e(f_{3,1}, D_3) \\ e(C_2, f) = e(f_2, D_2) \cdot e(f_{3,2}, D_3) \\ e(C_3, f) = e(X \cdot D_1 \cdot D_2, f) \cdot e(f_{3,3}, D_3). \end{cases}$$

If a cheating sender can come up with distinct openings of $\vec{C}_X$, we can easily
solve a S2P instance $(g_1, g_2, g_{1,c}, g_{2,d})$. Namely, the commitment key is set as
$(f_1, f_2, f_{3,1}, f_{3,2}) = (g_1, g_2, g_{1,c}, g_{2,d})$ and $f, f_{3,3}$ are chosen at random. When
the adversary outputs $(X, (D_1, D_2, D_3))$ and $(X', (D_1', D_2', D_3'))$, we must
simultaneously have $e(f_1, D_1/D_1') = e(f_{3,1}, D_3'/D_3)$, $e(f_2, D_2/D_2') = e(f_{3,2}, D_3'/D_3)$
and $e((X D_1 D_2)/(X' D_1' D_2'), f) = e(f_{3,3}, D_3'/D_3)$. Hence, setting $u = D_1/D_1'$,
$v = D_2/D_2'$ and $w = D_3'/D_3$ solves the S2P problem as $(u, v, w)$ can only be
trivial if $X' = X$.

Using the trapdoor $(\xi_1, \xi_2, \xi_3)$, the receiver can equivocate commitments.
Given a commitment $\vec{C}_X$ and its opening $(X, (D_1, D_2, D_3))$, one can trapdoor
open $\vec{C}_X$ to any other $X' \in \mathbb{G}$ (and without knowing $\log_g(X')$) by computing

$$D_1' = D_1 \cdot (X'/X)^{\xi_1/\xi_3}, \qquad D_2' = D_2 \cdot (X'/X)^{\xi_2/\xi_3}, \qquad D_3' = (X/X')^{1/\xi_3} \cdot D_3.$$
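
The commit, open and equivocate operations above can be exercised in a toy model. The sketch below is only an algebra check under loud assumptions: each group element $g^a$ is stored as its exponent $a$ modulo a small prime, so the "pairing" is plain multiplication of exponents and nothing is hidden or secure. The function names are hypothetical.

```python
import secrets

P = 1019  # toy prime group order (assumption, far too small for real use)

# Toy model: a group element g^a is stored as its exponent a mod P, so
# multiplication becomes addition, exponentiation becomes multiplication
# and the pairing e(g^a, g^b) = e(g, g)^(a*b) is stored as a*b mod P.
def rnd():
    return secrets.randbelow(P - 1) + 1

def mul(*xs):
    return sum(xs) % P

def power(x, k):
    return x * k % P

def pair(x, y):
    return x * y % P

def keygen():
    """Commitment key (f, f1, f2, f3) and trapdoor (xi1, xi2, xi3)."""
    f, f1, f2 = rnd(), rnd(), rnd()
    xi = (rnd(), rnd(), rnd())
    # f3 = f1_vec^xi1 * f2_vec^xi2 * (1, 1, f)^xi3, computed entry-wise
    f3 = (power(f1, xi[0]), power(f2, xi[1]), power(f, sum(xi)))
    return (f, f1, f2, f3), xi

def commit(ck, X):
    """Return the commitment C to X and the opening D = (f^phi1, f^phi2, f^phi3)."""
    f, f1, f2, f3 = ck
    p1, p2, p3 = rnd(), rnd(), rnd()
    C = (mul(power(f1, p1), power(f3[0], p3)),
         mul(power(f2, p2), power(f3[1], p3)),
         mul(X, power(f, p1 + p2), power(f3[2], p3)))
    D = (power(f, p1), power(f, p2), power(f, p3))
    return C, D

def verify(ck, C, X, D):
    """The three pairing checks performed by the receiver."""
    f, f1, f2, f3 = ck
    return (pair(C[0], f) == mul(pair(f1, D[0]), pair(f3[0], D[2])) and
            pair(C[1], f) == mul(pair(f2, D[1]), pair(f3[1], D[2])) and
            pair(C[2], f) == mul(pair(mul(X, D[0], D[1]), f), pair(f3[2], D[2])))

def equivocate(xi, X, D, X_new):
    """Trapdoor-open the same commitment to X_new."""
    x1, x2, x3 = xi
    inv3 = pow(x3, -1, P)
    ratio = (X_new - X) % P                    # X'/X in the toy model
    return (mul(D[0], power(ratio, x1 * inv3)),
            mul(D[1], power(ratio, x2 * inv3)),
            mul(D[2], power(-ratio, inv3)))    # (X/X')^(1/xi3) * D3
```

An honest opening passes `verify`, the same opening fails for a different value, and the opening returned by `equivocate` passes for that different value without touching the commitment itself.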

3.2 A Public Key Certification Scheme 

We use a primitive that we call non-interactive certification scheme, which can
be viewed as a signature scheme that only allows signing public keys from a
specific public key space $\mathcal{PK}$. These keys should be signed while retaining algebraic
properties that make it possible to prove knowledge of a public key and its
corresponding certificate in an efficient way. In particular, signing hashed public
keys is proscribed. In the interactive setting, several papers (e.g., [?]) describe
efficient interactive protocols where a public key is jointly generated by a user 
and a certification authority in such a way that the user eventually obtains a 
certified public key and no one else learns the underlying private key. In this pa- 
per, we aim at minimizing the amount of interaction and let users generate their 
public key entirely on their own before requesting their certification. Ideally, we 
would like to be able to sign public keys without even requiring users to prove 
knowledge of their private key and, in particular, without having to first rewind 
a proof of knowledge so as to extract the user’s private key in the security proof. 

A certification scheme consists of algorithms (Setup, Certify, CertVerify). The
first one is run by a certification authority (CA) that, on input of global
parameters cp, generates a key pair $(SK, PK) \leftarrow$ Setup(cp). On input of cp, SK and
a user’s public key pk, Certify generates a certificate $cert_{pk}$. The procedure CertVerify
takes as input cp, PK, pk and $cert_{pk}$ and outputs either 0 or 1.

Correctness mandates that CertVerify(cp, PK, pk, $cert_{pk}$) = 1 when $cert_{pk} \leftarrow$
Certify(cp, SK, pk). The (strong) unforgeability [?] requirement is the same as in
signature schemes. The adversary is supplied with a CA’s public key PK and
access to a certification oracle Certify(SK, ·) that can be queried for arbitrary
public keys $pk \in \mathcal{PK}$. Her goal is to produce a new pair $(pk^*, cert^*_{pk^*})$ (i.e., if $pk^*$
was queried to Certify(SK, ·), the output must have been different from $cert^*_{pk^*}$).

In the description hereafter, we assume common public parameters cp consisting
of bilinear groups $(\mathbb{G}, \mathbb{G}_T)$ of prime order $p > 2^{\lambda}$, for a security parameter
$\lambda$, and a generator $g \leftarrow \mathbb{G}$. We also assume that certified public keys always
consist of a fixed number n of group elements (i.e., $\mathcal{PK} = \mathbb{G}^n$).

Intuition. The scheme borrows from the Boyen-Waters group signature [?]
in the use of the HSDH assumption. A simplified version involves a CA that
holds a public key $PK = (\Omega = g^{\omega},\; A = e(g, g)^{\alpha},\; u,\; u_0,\; u_1 = g^{\beta_1}, \ldots, u_n = g^{\beta_n})$,
for private elements $SK = (\omega, \alpha, \beta_1, \ldots, \beta_n)$, where n denotes the number of
group elements that certified public keys consist of. To certify a public key
$pk = (X_1 = g^{x_1}, \ldots, X_n = g^{x_n})$, the CA chooses an exponent $c_{ID} \leftarrow \mathbb{Z}_p^*$ and
computes $S_1 = (g^{\alpha})^{1/(\omega + c_{ID})}$, $S_2 = g^{c_{ID}}$, $S_3 = u^{c_{ID}}$, $S_4 = (u_0 \cdot \prod_{i=1}^{n} X_i^{\beta_i})^{c_{ID}}$
and $S_5 = (S_{5,1}, \ldots, S_{5,n}) = (X_1^{c_{ID}}, \ldots, X_n^{c_{ID}})$. Verification then checks whether
$e(S_1, \Omega \cdot S_2) = A$ and $e(S_2, u) = e(g, S_3)$ as in [?]. It must also be checked that
$e(S_4, g) = e(u_0, S_2) \cdot \prod_{i=1}^{n} e(u_i, S_{5,i})$ and $e(S_{5,i}, g) = e(X_i, S_2)$ for $i = 1, \ldots, n$.

The security of this simplified scheme can only be proven if, when answering
certification queries, the simulator can control the private keys $(x_1, \ldots, x_n)$ and
force them to be random values of its choice. To allow the simulator to sign
arbitrary public keys without knowing the private keys, we modify the scheme so
that the CA rather signs commitments (calculated as in the trapdoor commitment
of section 3.1) to public key elements $X_1, \ldots, X_n$. In the security proof, the
simulator first generates a signature on n commitments $\vec{C}_i = (C_{i,1}, C_{i,2}, C_{i,3})$ to
$1_{\mathbb{G}}$ that are all generated in such a way that it knows $\log_g(C_{i,j})$ for $i = 1, \ldots, n$
and $j = 1, 2, 3$. Using the trapdoor of the commitment scheme, it can then open
$\vec{C}_i$ to any arbitrary public key element $X_i$ without knowing $\log_g(X_i)$.

This use of the trapdoor commitment is reminiscent of a technique (notably
used in [?]) to construct signature schemes in the standard model using
chameleon hash functions [?]: the simulator first signs messages of its choice
using a basic signature scheme and then “equivocates” the chameleon hashes to
make them correspond to adversarially-chosen messages.

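Setup(cp): given common public parameters $cp = \{g, \mathbb{G}, \mathbb{G}_T\}$, select $u, u_0 \leftarrow \mathbb{G}$,
$\alpha, \omega \leftarrow \mathbb{Z}_p^*$ and set $A = e(g, g)^{\alpha}$, $\Omega = g^{\omega}$. Pick $\beta_{i,1}, \beta_{i,2}, \beta_{i,3} \leftarrow \mathbb{Z}_p^*$
and set $\vec{u}_i = (u_{i,1}, u_{i,2}, u_{i,3}) = (g^{\beta_{i,1}}, g^{\beta_{i,2}}, g^{\beta_{i,3}})$ for $i = 1, \ldots, n$. Choose
$f, f_1, f_2, f_{3,1}, f_{3,2}, f_{3,3} \leftarrow \mathbb{G}$ that define a commitment key consisting of vectors
$\vec{f}_1 = (f_1, 1, f)$, $\vec{f}_2 = (1, f_2, f)$ and $\vec{f}_3 = (f_{3,1}, f_{3,2}, f_{3,3})$. Define the
private/public key pair as $SK = (\alpha, \omega, \{\vec{\beta}_i = (\beta_{i,1}, \beta_{i,2}, \beta_{i,3})\}_{i=1,\ldots,n})$ and

$$PK = \big(\mathbf{f} = (\vec{f}_1, \vec{f}_2, \vec{f}_3),\; A = e(g, g)^{\alpha},\; \Omega = g^{\omega},\; u,\; u_0,\; \{\vec{u}_i\}_{i=1,\ldots,n}\big).$$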

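Certify(cp, SK, pk): parse SK as $(\alpha, \omega, \{\vec{\beta}_i\}_{i=1,\ldots,n})$, pk as $(X_1, \ldots, X_n)$ and do
the following.

1. For each $i \in \{1, \ldots, n\}$, pick $\phi_{i,1}, \phi_{i,2}, \phi_{i,3} \leftarrow \mathbb{Z}_p^*$ and compute a commitment

$$\vec{C}_i = (C_{i,1}, C_{i,2}, C_{i,3}) = \big(f_1^{\phi_{i,1}} \cdot f_{3,1}^{\phi_{i,3}},\; f_2^{\phi_{i,2}} \cdot f_{3,2}^{\phi_{i,3}},\; X_i \cdot f^{\phi_{i,1}+\phi_{i,2}} \cdot f_{3,3}^{\phi_{i,3}}\big)$$

and the matching de-commitment $(D_{i,1}, D_{i,2}, D_{i,3}) = (f^{\phi_{i,1}}, f^{\phi_{i,2}}, f^{\phi_{i,3}})$.

2. Choose $c_{ID} \leftarrow \mathbb{Z}_p^*$, compute $S_1 = (g^{\alpha})^{1/(\omega + c_{ID})}$, $S_2 = g^{c_{ID}}$, $S_3 = u^{c_{ID}}$ and

$$S_4 = \Big(u_0 \cdot \prod_{i=1}^{n} \big(C_{i,1}^{\beta_{i,1}} \cdot C_{i,2}^{\beta_{i,2}} \cdot C_{i,3}^{\beta_{i,3}}\big)\Big)^{c_{ID}}, \qquad S_5 = \{(S_{5,i,1}, S_{5,i,2}, S_{5,i,3})\}_{i=1,\ldots,n} = \{(C_{i,1}^{c_{ID}}, C_{i,2}^{c_{ID}}, C_{i,3}^{c_{ID}})\}_{i=1,\ldots,n}.$$

Return $cert_{pk} = (\{(C_{i,1}, C_{i,2}, C_{i,3}), (D_{i,1}, D_{i,2}, D_{i,3})\}_{i=1,\ldots,n}, S_1, S_2, S_3, S_4, S_5)$.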
CertVerify(cp, PK, pk, $cert_{pk}$): parse pk as $(X_1, \ldots, X_n)$ and $cert_{pk}$ as above.
Return 1 if, for $i = 1, \ldots, n$, it holds that

$$e(C_{i,1}, f) = e(f_1, D_{i,1}) \cdot e(f_{3,1}, D_{i,3}) \qquad (1)$$

$$e(C_{i,2}, f) = e(f_2, D_{i,2}) \cdot e(f_{3,2}, D_{i,3}) \qquad (2)$$

$$e(C_{i,3}, f) = e(X_i \cdot D_{i,1} \cdot D_{i,2}, f) \cdot e(f_{3,3}, D_{i,3}), \qquad (3)$$

and if the following checks are also satisfied. Otherwise, return 0.

$$e(S_1, \Omega \cdot S_2) = A \qquad (4)$$

$$e(S_2, u) = e(g, S_3) \qquad (5)$$

$$e(S_4, g) = e(u_0, S_2) \cdot \prod_{i=1}^{n} \big(e(u_{i,1}, S_{5,i,1}) \cdot e(u_{i,2}, S_{5,i,2}) \cdot e(u_{i,3}, S_{5,i,3})\big), \qquad (6)$$

$$e(S_{5,i,j}, g) = e(C_{i,j}, S_2) \quad \text{for } i = 1, \ldots, n,\; j = 1, 2, 3 \qquad (7)$$


A certificate comprises 9n + 4 group elements. It would be interesting to avoid 
this linear dependency on n without destroying the algebraic properties that 
render the scheme compatible with Groth-Sahai proofs. 

Regarding the security of this scheme, the idea of the proof of the following
theorem is sketched in Appendix A. Due to space limitations, the complete proof
is detailed in the full version of the paper.

Theorem 1. The scheme is a secure non-interactive certification system if the 
HSDH, FlexDH and S2P problems are all hard in G. 

We believe that the above certification scheme is of interest in its own right.
For instance, it can be used to construct non-frameable group signatures that
are secure in the concurrent join model of [?] without resorting to random
oracles. To the best of our knowledge, the Kiayias-Yung construction [?] has
remained the only scalable group signature where joining supports concurrency
at both ends while requiring the smallest amount of interaction. In the standard
model, our certification scheme thus appears to provide the first² way to achieve
the same result. In this case, we have n = 1 (since prospective group members
only need to certify one group element if non-frameability is ensured by signing
messages as in Groth’s group signature [?]) so that membership certificates
comprise 13 group elements and their shape is fully compatible with GS proofs.
2 Non-frameable group signatures described in [?] achieve concurrent security by
having the prospective user generate an extractable commitment to some secret
exponent (which the simulator can extract without rewinding using the trapdoor of
the commitment) and prove that the committed value is the discrete log. of a public
value. In the standard model, this technique requires interaction and the proof should
be simulatable in zero-knowledge when proving security against framing attacks.
Another technique [?] requires users to prove knowledge of their secret exponent
using Groth-Sahai non-interactive proofs. It is nevertheless space-demanding as each
bit of committed exponent requires its own extractable GS commitment.

3.3 Public Key Encryption Schemes Based on the Linear Problem

We need cryptosystems based on the DLIN assumption. The first one is
Shacham’s variant [?] of Cramer-Shoup [?] and, since it is key-private [?],
we use it to encrypt witnesses. We also use Kiltz’s tag-based encryption (TBE)
scheme [?], where the validity of ciphertexts is publicly verifiable, to encrypt
receivers’ public keys under the public key of the opening authority.


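Shacham’s Linear Cramer-Shoup. If we assume public generators $g_1, g_2, g$
that are parts of public parameters, each receiver’s public key is made of n = 6
group elements

$$\begin{array}{lll} X_1 = g_1^{x_1} g^{x} & X_3 = g_1^{x_3} g^{y} & X_5 = g_1^{x_5} g^{z} \\ X_2 = g_2^{x_2} g^{x} & X_4 = g_2^{x_4} g^{y} & X_6 = g_2^{x_6} g^{z}. \end{array}$$

To encrypt $m \in \mathbb{G}$ under the label L, the sender picks $r, s \leftarrow \mathbb{Z}_p^*$ and computes

$$\psi_{CS} = (U_1, U_2, U_3, U_4, U_5) = \big(g_1^{r},\; g_2^{s},\; g^{r+s},\; m \cdot X_5^{r} X_6^{s},\; (X_1 X_3^{\alpha})^{r} \cdot (X_2 X_4^{\alpha})^{s}\big),$$

where $\alpha = H(U_1, U_2, U_3, U_4, L) \in \mathbb{Z}_p^*$ is a collision-resistant hash³. Given $(\psi_{CS}, L)$,
the receiver computes $\alpha$. He returns $\perp$ if $U_5 \neq U_1^{x_1 + \alpha x_3} U_2^{x_2 + \alpha x_4} U_3^{x + \alpha y}$ and
$m = U_4 / (U_1^{x_5} U_2^{x_6} U_3^{z})$ otherwise.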
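
As a quick consistency check of the encryption and decryption formulas, here is a small sketch. It is illustrative only: the group is a toy subgroup modulo 2039 (an assumption, not secure), the hash $H$ is stood in for by SHA-256 reduced modulo the group order, and all function names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (assumption, NOT secure): subgroup of squares mod Q, order P.
Q, P, g = 2039, 1019, 4

def rnd():
    return secrets.randbelow(P - 1) + 1

def H(*parts):
    """Stand-in hash into Z_p built from SHA-256 (an illustrative choice)."""
    data = "|".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P

def keygen():
    g1, g2 = pow(g, rnd(), Q), pow(g, rnd(), Q)
    x1, x2, x3, x4, x5, x6, x, y, z = (rnd() for _ in range(9))
    pk = (pow(g1, x1, Q) * pow(g, x, Q) % Q, pow(g2, x2, Q) * pow(g, x, Q) % Q,
          pow(g1, x3, Q) * pow(g, y, Q) % Q, pow(g2, x4, Q) * pow(g, y, Q) % Q,
          pow(g1, x5, Q) * pow(g, z, Q) % Q, pow(g2, x6, Q) * pow(g, z, Q) % Q)
    return (g1, g2), pk, (x1, x2, x3, x4, x5, x6, x, y, z)

def encrypt(gens, pk, m, label):
    g1, g2 = gens
    X1, X2, X3, X4, X5, X6 = pk
    r, s = rnd(), rnd()
    U1, U2, U3 = pow(g1, r, Q), pow(g2, s, Q), pow(g, r + s, Q)
    U4 = m * pow(X5, r, Q) * pow(X6, s, Q) % Q
    a = H(U1, U2, U3, U4, label)
    U5 = pow(X1 * pow(X3, a, Q), r, Q) * pow(X2 * pow(X4, a, Q), s, Q) % Q
    return U1, U2, U3, U4, U5

def decrypt(sk, ct, label):
    x1, x2, x3, x4, x5, x6, x, y, z = sk
    U1, U2, U3, U4, U5 = ct
    a = H(U1, U2, U3, U4, label)
    check = (pow(U1, x1 + a * x3, Q) * pow(U2, x2 + a * x4, Q)
             * pow(U3, x + a * y, Q)) % Q
    if U5 != check:
        return None                      # the symbol "bottom" in the text
    mask = pow(U1, x5, Q) * pow(U2, x6, Q) * pow(U3, z, Q) % Q
    return U4 * pow(mask, -1, Q) % Q
```

A well-formed ciphertext decrypts to the original message, while altering $U_5$ makes the validity check fail.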


Kiltz’s Tag-Based Encryption Scheme. In [?], Kiltz described a TBE
scheme based on the same assumption. The public key is $(Y_1, Y_2, Y_3, Y_4) =
(g^{y_1}, g^{y_2}, g^{y_3}, g^{y_4})$ if $g \in \mathbb{G}$ is part of public parameters. To encrypt $m \in \mathbb{G}$
under a tag $t \in \mathbb{Z}_p^*$, the sender picks $w_1, w_2 \leftarrow \mathbb{Z}_p^*$ and computes

$$\psi_K = (V_1, V_2, V_3, V_4, V_5) = \big(Y_1^{w_1},\; Y_2^{w_2},\; (g^{t} Y_3)^{w_1},\; (g^{t} Y_4)^{w_2},\; m \cdot g^{w_1+w_2}\big).$$

To decrypt $\psi_K$, the receiver checks that $V_3 = V_1^{(t+y_3)/y_1}$, $V_4 = V_2^{(t+y_4)/y_2}$. If so,
it outputs the plaintext $m = V_5 / (V_1^{1/y_1} V_2^{1/y_2})$. Unlike $\psi_{CS}$, the well-formedness
of $\psi_K$ is publicly verifiable in bilinear groups. The Canetti-Halevi-Katz [?]
paradigm turns this scheme into a full-fledged CCA2 scheme by deriving the
tag t from the verification key VK of a one-time signature, the private key SK of
which is used to sign $(V_1, V_2, V_3, V_4, V_5)$.
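
The tag check and the decryption formula can be illustrated with a few lines of code. This is a hedged sketch over an assumed toy group (order-1019 subgroup modulo 2039, not secure); the receiver-side check below uses the secret exponents, whereas the public pairing-based check mentioned in the text is not modelled. Function names are illustrative.

```python
import secrets

# Toy parameters (assumption, NOT secure): subgroup of squares mod Q, order P.
Q, P, g = 2039, 1019, 4

def rnd():
    return secrets.randbelow(P - 1) + 1

def keygen():
    sk = tuple(rnd() for _ in range(4))            # (y1, y2, y3, y4)
    pk = tuple(pow(g, y, Q) for y in sk)           # (Y1, Y2, Y3, Y4)
    return pk, sk

def encrypt(pk, m, t):
    Y1, Y2, Y3, Y4 = pk
    w1, w2 = rnd(), rnd()
    gt = pow(g, t, Q)
    return (pow(Y1, w1, Q), pow(Y2, w2, Q),
            pow(gt * Y3, w1, Q), pow(gt * Y4, w2, Q),
            m * pow(g, w1 + w2, Q) % Q)

def decrypt(sk, ct, t):
    y1, y2, y3, y4 = sk
    V1, V2, V3, V4, V5 = ct
    i1, i2 = pow(y1, -1, P), pow(y2, -1, P)        # 1/y1 and 1/y2 mod P
    # Validity check: V3 = V1^((t+y3)/y1) and V4 = V2^((t+y4)/y2)
    if V3 != pow(V1, (t + y3) * i1, Q) or V4 != pow(V2, (t + y4) * i2, Q):
        return None
    mask = pow(V1, i1, Q) * pow(V2, i2, Q) % Q     # equals g^(w1+w2)
    return V5 * pow(mask, -1, Q) % Q
```

Decrypting under the tag used for encryption returns the message, while any other tag (distinct modulo the group order) is rejected by the validity check.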


4 A GE Scheme with Non-interactive Proofs 


We build a non-interactive group encryption scheme for the Diffie-Hellman
relation $\mathcal{R} = \{(X, Y), W\}$ where $e(g, W) = e(X, Y)$, for which the keys are
$pk_{\mathcal{R}} = \{\mathbb{G}, \mathbb{G}_T, g\}$ and $sk_{\mathcal{R}} = \varepsilon$.

The construction slightly departs from the modular design of [?] in that
commitments to the receiver’s public key and certificate are part of the proof (instead
of the ciphertext), which simplifies the proof of message-security. The security
of the scheme eventually relies on the HSDH, FlexDH and DLIN assumptions.
All security proofs are available in the full version of the paper.

3 The proof of CCA2-security only requires a universal one-way hash function
(UOWHF) but collision-resistance is required by the proof of key-privacy in [?].

SETUP$_{init}(\lambda)$: choose bilinear groups $(\mathbb{G}, \mathbb{G}_T)$ of order $p > 2^{\lambda}$, $g \leftarrow \mathbb{G}$ and
$g_1 = g^{\alpha_1}$, $g_2 = g^{\alpha_2}$ with $\alpha_1, \alpha_2 \leftarrow \mathbb{Z}_p^*$. Define $\vec{g}_1 = (g_1, 1, g)$, $\vec{g}_2 = (1, g_2, g)$
and $\vec{g}_3 = \vec{g}_1^{\,\xi_1} \odot \vec{g}_2^{\,\xi_2}$ with $\xi_1, \xi_2 \leftarrow \mathbb{Z}_p^*$, which form a CRS $\mathbf{g} = (\vec{g}_1, \vec{g}_2, \vec{g}_3)$
for the perfect soundness setting. Select a strongly unforgeable (as defined
in [?]) one time signature scheme $\Sigma = (\mathcal{G}, \mathcal{S}, \mathcal{V})$ and a random member
$H : \{0, 1\}^* \rightarrow \mathbb{Z}_p$ of a collision-resistant hash family. Public parameters
consist of $param = \{\lambda, \mathbb{G}, \mathbb{G}_T, g, \mathbf{g}, \Sigma, H\}$.

SETUP$_{GM}$(param): runs the setup algorithm of the certification scheme
described in section 3.2 with n = 6. The obtained public key consists of
$pk_{GM} = (\mathbf{f}, A = e(g, g)^{\alpha}, \Omega = g^{\omega}, u, u_0, \{\vec{u}_i\}_{i=1,\ldots,6})$ and the matching
private key is $sk_{GM} = (\alpha, \omega, \{\vec{\beta}_i = (\beta_{i,1}, \beta_{i,2}, \beta_{i,3})\}_{i=1,\ldots,6})$.

SETUP$_{OA}$(param): generates $pk_{OA} = (Y_1, Y_2, Y_3, Y_4) = (g^{y_1}, g^{y_2}, g^{y_3}, g^{y_4})$, as a
public key for Kiltz’s tag-based encryption scheme [?], and the corresponding
private key as $sk_{OA} = (y_1, y_2, y_3, y_4)$.

JOIN: the user sends a linear Cramer-Shoup public key $pk = (X_1, \ldots, X_6) \in \mathbb{G}^6$
to the GM and obtains a certificate

$$cert_{pk} = (\{(C_{i,1}, C_{i,2}, C_{i,3}), (D_{i,1}, D_{i,2}, D_{i,3})\}_{i=1,\ldots,6}, S_1, S_2, S_3, S_4, S_5).$$

ENC($pk_{GM}$, $pk_{OA}$, pk, $cert_{pk}$, W, L): to encrypt $W \in \mathbb{G}$ such that $((X, Y), W) \in \mathcal{R}$
(for public elements $X, Y \in \mathbb{G}$), parse $pk_{GM}$, $pk_{OA}$ and pk as above and do
the following.

1. Generate a one-time signature key pair $(SK, VK) \leftarrow \mathcal{G}(\lambda)$.

2. Choose $r, s \leftarrow \mathbb{Z}_p^*$ and compute a linear CS encryption of W, the result
of which is denoted by $\psi_{CS}$, under the label $L_1 = L \| VK$ as per section
3.3 (and using the collision-resistant hash function specified by param).

3. For $i = 1, \ldots, 6$, choose $w_{i,1}, w_{i,2} \leftarrow \mathbb{Z}_p^*$ and encrypt $X_i$ under $pk_{OA}$ using
Kiltz’s TBE with the tag VK as described in section 3.3. Let $\psi_{K_i}$ be the
ciphertexts.

4. Set the ciphertext $\psi$ as $\psi = VK \| \psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| \sigma$ where $\sigma$ is
obtained as $\sigma = \mathcal{S}(SK, (\psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| L))$.

Return $(\psi, L)$ and $coins_{\psi}$ consist of $\{(w_{i,1}, w_{i,2})\}_{i=1,\ldots,6}$, $(r, s)$. If the one-time
signature of [?] is used, VK and $\sigma$ take 3 and 2 group elements,
respectively, so that $\psi$ comprises 40 group elements.
$\mathcal{P}$($pk_{GM}$, $pk_{OA}$, pk, $cert_{pk}$, (X, Y), W, $\psi$, L, $coins_{\psi}$): parse $pk_{GM}$, $pk_{OA}$, pk and $\psi$
as above. Conduct the following steps.

1. Generate commitments (as explained in section 2.3) to the 9n + 4 = 58
group elements that $cert_{pk}$ consists of. The resulting overall commitment
$com_{cert_{pk}}$ contains 184 group elements.

2. Generate commitments to the public key elements $pk = (X_1, \ldots, X_6)$ and
obtain $com_{pk} = \{com_{X_i}\}_{i=1,\ldots,6}$, which consists of 18 group elements.

3. Generate a proof $\pi_{cert_{pk}}$ that $com_{cert_{pk}}$ is a commitment to a valid
certificate for the public key contained in $com_{pk}$. For each $i = 1, \ldots, 6$,
relations (1)-(3) cost 9 elements to prove (and thus 54 elements
altogether). The quadratic equation (4) takes 9 elements and linear ones
(5)-(6) both require 3 elements. Finally, (7) is a set of 18 linear
equations which demand 54 elements altogether. The whole proof $\pi_{cert_{pk}}$ thus
takes 123 group elements.

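4. For $i = 1, \ldots, 6$, generate a NIZK proof $\pi_{eq\text{-}key,i}$ that $com_{X_i}$ (which
is part of $com_{pk}$) and $\psi_{K_i}$ are encryptions of the same $X_i$. If $\psi_{K_i}$
comprises $(V_{i,1}, V_{i,2}, V_{i,5}) = (Y_1^{w_{i,1}}, Y_2^{w_{i,2}}, X_i \cdot g^{w_{i,1}+w_{i,2}})$ and $com_{X_i}$ is parsed
as $(c_{X_i 1}, c_{X_i 2}, c_{X_i 3}) = (g_1^{\theta_{i1}} \cdot g_{3,1}^{\theta_{i3}},\; g_2^{\theta_{i2}} \cdot g_{3,2}^{\theta_{i3}},\; X_i \cdot g^{\theta_{i1}+\theta_{i2}} \cdot g_{3,3}^{\theta_{i3}})$, where
$w_{i,1}, w_{i,2} \in coins_{\psi}$, $\theta_{i1}, \theta_{i2}, \theta_{i3} \in \mathbb{Z}_p^*$ and $\vec{g}_3 = (g_{3,1}, g_{3,2}, g_{3,3})$, this
amounts to prove knowledge of values $w_{i,1}, w_{i,2}, \theta_{i1}, \theta_{i2}, \theta_{i3}$ such that

$$\Big(\frac{V_{i,1}}{c_{X_i 1}},\; \frac{V_{i,2}}{c_{X_i 2}},\; \frac{V_{i,5}}{c_{X_i 3}}\Big) = \big(Y_1^{w_{i,1}} \cdot g_1^{-\theta_{i1}} \cdot g_{3,1}^{-\theta_{i3}},\; Y_2^{w_{i,2}} \cdot g_2^{-\theta_{i2}} \cdot g_{3,2}^{-\theta_{i3}},\; g^{w_{i,1}+w_{i,2}-\theta_{i1}-\theta_{i2}} \cdot g_{3,3}^{-\theta_{i3}}\big).$$

Committing to $w_{i,1}, w_{i,2}, \theta_{i1}, \theta_{i2}, \theta_{i3}$ introduces 90 group elements
whereas the above relations only require two elements each. Overall,
proof elements $\pi_{eq\text{-}key,1}, \ldots, \pi_{eq\text{-}key,6}$ incur 126 elements.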

5. Generate a NIZK proof $\pi_{val\text{-}enc}$ that $\psi_{CS} = (U_1, U_2, U_3, U_4, U_5)$ is a valid
CS encryption. This requires to commit to underlying encryption
exponents $r, s \in coins_{\psi}$ and prove that $U_1 = g_1^{r}$, $U_2 = g_2^{s}$, $U_3 = g^{r+s}$
(which only takes 3 times 2 elements as base elements are public) and
$U_5 = (X_1 X_3^{\alpha})^{r} (X_2 X_4^{\alpha})^{s}$ (which takes 9 elements since base elements are
themselves variables). Including commitments $com_r$ and $com_s$ to
exponents r and s, $\pi_{val\text{-}enc}$ demands 21 group elements overall.

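6. Generate a NIZK proof $\pi_{\mathcal{R}}$ that $\psi_{CS}$ encrypts a group element $W \in \mathbb{G}$
such that $((X, Y), W) \in \mathcal{R}$. To this end, generate a commitment $com_W =
(c_{W,1}, c_{W,2}, c_{W,3}) = (g_1^{\theta_1} \cdot g_{3,1}^{\theta_3},\; g_2^{\theta_2} \cdot g_{3,2}^{\theta_3},\; W \cdot g^{\theta_1+\theta_2} \cdot g_{3,3}^{\theta_3})$ and prove that the
underlying W is the same as the one for which $U_4 = W \cdot X_5^{r} X_6^{s}$ in $\psi_{CS}$.
In other words, prove knowledge of $r, s, \theta_1, \theta_2, \theta_3$ such that

$$\Big(\frac{U_1}{c_{W,1}},\; \frac{U_2}{c_{W,2}},\; \frac{U_4}{c_{W,3}}\Big) = \big(g_1^{r-\theta_1} \cdot g_{3,1}^{-\theta_3},\; g_2^{s-\theta_2} \cdot g_{3,2}^{-\theta_3},\; g^{-\theta_1-\theta_2} \cdot g_{3,3}^{-\theta_3} \cdot X_5^{r} X_6^{s}\big). \qquad (8)$$

Commitments to r, s are already part of $\pi_{val\text{-}enc}$. Committing to $\theta_1, \theta_2, \theta_3$
takes 9 elements. Proving the first two relations of (8) requires 4 elements
whereas the third one is quadratic and its proof is 9 elements. Proving
the linear pairing-product relation $e(g, W) = e(X, Y)$ in NIZK⁴ demands
9 elements. Since $\pi_{\mathcal{R}}$ includes $com_W$, it entails a total of 34 elements.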


4 It requires to introduce an auxiliary variable $\mathcal{X}$ and prove that $e(g, W) = e(\mathcal{X}, Y)$
and $\mathcal{X} = X$, for variables $W, \mathcal{X}$ and constants $g, X, Y$. The two proofs take 3 elements
each and 3 elements are needed to commit to $\mathcal{X}$.

The proof $\pi_{\psi} = com_{cert_{pk}} \| com_{pk} \| \pi_{cert_{pk}} \| \pi_{eq\text{-}key,1} \| \cdots \| \pi_{eq\text{-}key,6} \| \pi_{val\text{-}enc} \| \pi_{\mathcal{R}}$
eventually takes 516 elements.

$\mathcal{V}$(param, $\psi$, L, $\pi_{\psi}$, $pk_{GM}$, $pk_{OA}$): parse $pk_{GM}$, $pk_{OA}$, pk, $\psi$ and $\pi_{\psi}$ as above.
Return 1 if and only if $\mathcal{V}(VK, \sigma, (\psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| L)) = 1$, all proofs verify
and if $\psi_{K_1}, \ldots, \psi_{K_6}$ are all valid tag-based encryptions w.r.t. the tag VK.

DEC(sk, $\psi$, L): parse the ciphertext $\psi$ as $VK \| \psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| \sigma$. Return $\perp$ if
$\mathcal{V}(VK, \sigma, (\psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| L)) = 0$. Otherwise, use sk to decrypt $(\psi_{CS}, L)$.

OPEN($sk_{OA}$, $\psi$, L): parse the ciphertext $\psi$ as $VK \| \psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| \sigma$. Return
$\perp$ if $\psi_{K_1}, \ldots, \psi_{K_6}$ are not all valid TBE ciphertexts w.r.t. the tag VK or if
$\mathcal{V}(VK, \sigma, (\psi_{CS} \| \psi_{K_1} \| \cdots \| \psi_{K_6} \| L)) = 0$. Otherwise, decrypt $\psi_{K_1}, \ldots, \psi_{K_6}$ using
$sk_{OA}$ and return the resulting $pk = (X_1, \ldots, X_6)$.

From an efficiency standpoint, the length of ciphertexts is about 1.25 kB in an
implementation using symmetric pairings with a 256-bit group order, which is
more compact than in the Paillier-based scheme of [?] where ciphertexts take 2.5
kB using 1024-bit moduli. Moreover, our proofs only require 16.125 kB, which
is significantly cheaper than in the original GE scheme [29], where interactive
proofs reach a communication cost of 70 kB to achieve a $2^{-50}$ knowledge error.
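
The two sizes quoted for this scheme follow from the element counts given earlier. The short calculation below is a sketch that assumes each group element is encoded in 256 bits and that 1 kB means 1024 bytes; both conventions are assumptions chosen because they reproduce the stated figures.

```python
ELEMENT_BITS = 256                     # assumed encoding size of one group element
BYTES_PER_ELEMENT = ELEMENT_BITS // 8
KB = 1024                              # assumption: 1 kB = 1024 bytes

def size_kb(num_elements):
    """Size in kB of a string made of `num_elements` group elements."""
    return num_elements * BYTES_PER_ELEMENT / KB

# Ciphertext: VK (3) + linear Cramer-Shoup (5) + six Kiltz ciphertexts (5 each)
# + one-time signature (2), as counted in the description of ENC.
ciphertext_elements = 3 + 5 + 6 * 5 + 2
proof_elements = 516                   # total stated for the proof

print(ciphertext_elements, size_kb(ciphertext_elements))
print(proof_elements, size_kb(proof_elements))
```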
A Sketch of the Proof of Theorem 1

The security proof of the certification scheme considers three kinds of forgeries
in the attack game.

- Type I forgeries: are such that the fake certificate $cert^*_{pk^*}$ contains a tuple of
elements $(S_1^*, S_2^*, S_3^*)$ that never appeared in outputs of certification queries.

- Type II forgeries: are such that $cert^*_{pk^*}$ contains a triple $(S_1^*, S_2^*, S_3^*)$ that
appeared in the output of some query but $cert^*_{pk^*}$ also contains commitments
$\{(C^*_{i,1}, C^*_{i,2}, C^*_{i,3})\}_{i=1,\ldots,n}$ that do not match those in the output of that query.

- Type III forgeries: are such that $(S_1^*, S_2^*, S_3^*)$ and $\{(C^*_{i,1}, C^*_{i,2}, C^*_{i,3})\}_{i=1,\ldots,n}$
are identical in $cert^*_{pk^*}$ and in the output of some certification query. On
the other hand, the public key $pk^* = (X_1^*, \ldots, X_n^*)$ is not the one that was
certified in that query.

Type I forgeries are easily seen to break the HSDH assumption whereas Type
II and Type III forgeries give rise to algorithms solving the FlexDH and S2P
problems, respectively. Due to space limitations, the details are deferred to the
full version of the paper. □
Abstract. Predicate encryption is a recent generalization of identity-based
encryption (IBE), broadcast encryption, attribute-based encryption,
and more. A natural question is whether there exist black-box
constructions of predicate encryption based on generic building blocks,
e.g., trapdoor permutations. Boneh et al. (FOCS 2008) recently gave a
negative answer for the specific case of IBE.

We show both negative and positive results. First, we identify a com- 
binatorial property on the sets of predicates/attributes and show that, 
for any sets having this property, no black-box construction of predicate 
encryption from trapdoor permutations (or even CCA-secure encryption) 
is possible. Our framework implies the result of Boneh et al. as a special 
case, and also rules out, e.g., black-box constructions of forward-secure 
encryption and broadcast encryption (with many excluded users). On 
the positive side, we identify conditions under which predicate encryp- 
tion schemes can be constructed based on any CPA-secure (standard) 
encryption scheme. 


1 Introduction 

In a predicate encryption scheme [?] an authority generates a master public
key and a master secret key, and uses the master secret key to derive personal
secret keys for individual users. A personal secret key corresponds to a predicate
in some class $\mathcal{F}$, and ciphertexts are associated (by the sender) with an
attribute in some set $\mathbb{A}$; a ciphertext associated with the attribute $I \in \mathbb{A}$ can be
decrypted by a secret key $SK_f$ corresponding to the predicate $f \in \mathcal{F}$ if and only
if $f(I) = 1$. The basic security guarantee provided by such schemes is that a
ciphertext associated with an attribute I hides all information about the
underlying message unless one has a personal secret key giving the explicit ability to
decrypt; in other words, if an adversary $\mathcal{A}$ holds keys $SK_{f_1}, \ldots, SK_{f_\ell}$ for which
$f_1(I) = \cdots = f_\ell(I) = 0$, then $\mathcal{A}$ should learn nothing about the message. (A
formal definition is given later.)

By choosing $\mathcal{F}$ and $\mathbb{A}$ appropriately, predicate encryption yields as special
cases many notions that are interesting in their own right. For example, by taking
$\mathbb{A} = \{0, 1\}^n$ and letting $\mathcal{F} = \{f_{ID}\}_{ID \in \{0,1\}^n}$ be the class of point functions
(so that $f_{ID}(ID') = 1$ iff $ID = ID'$) we recover the notion of identity-based
encryption (IBE) [?]. Similarly, it can be observed that predicate encryption
encompasses fuzzy IBE [?], forward-secure (public-key) encryption [?],
(public-key) broadcast encryption [?], attribute-based encryption, and more as
special cases.

* Work done while visiting IBM. Research supported by DARPA, and by the US Army
Research Laboratory and the UK Ministry of Defence under agreement number
W911NF-06-3-0001.

Most (though not all) existing constructions of predicate encryption schemes
rely on bilinear maps. A natural question is: what are the minimal assumptions
on which predicate encryption can be based? Of course, the answer will depend
on the specific predicate class $\mathcal{F}$ and attribute set $\mathbb{A}$ of interest; in particular,
Boneh and Waters [?] show that if $\mathcal{F}$ is polynomial size then (for any $\mathbb{A}$) one can
construct a predicate encryption scheme for $(\mathcal{F}, \mathbb{A})$ from any (standard)
public-key encryption scheme. On the other hand, Boneh et al. [5] have recently shown
that there is no black-box construction of IBE from trapdoor permutations.

1.1 Our Results 

The specific question we consider is: for which $(\mathcal{F}, \mathbb{A})$ can we construct a predicate
encryption scheme over $(\mathcal{F}, \mathbb{A})$ based on CPA-secure encryption? We show both
negative and positive results. Before describing these results in more detail, we
provide some background intuition.

A natural combinatorial construction of a predicate encryption scheme over
some $(\mathcal{F}, \mathbb{A})$ from a CPA-secure encryption scheme (Gen, Enc, Dec) is as follows:
The authority includes several public keys $pk_1, \ldots, pk_q$ in the master public
key, and each personal secret key is some subset of the corresponding secret
keys $sk_1, \ldots, sk_q$. Encryption of a message m with respect to an attribute I
requires “sharing” m in some way to yield $m_1, \ldots, m_q$, and the resulting ciphertext
is $Enc_{pk_1}(m_1), \ldots, Enc_{pk_q}(m_q)$. Intuitively, this works if:

Correctness: Let $SK_f = \{sk_{i_1}, \ldots, sk_{i_t}\}$ be a personal secret key for which
$f(I) = 1$. Then the “shares” $m_{i_1}, \ldots, m_{i_t}$ should enable recovery of m.
Security: Let $\{sk_{i_1}, \ldots, sk_{i_k}\} = \bigcup_{f \in \mathcal{F} : f(I) = 0} SK_f$. Then the set of “shares”
$m_{i_1}, \ldots, m_{i_k}$ should leak no information about m.¹

Roughly, our negative result can be interpreted as showing that this is essentially 
the only way to construct predicate encryption (in a black-box way) from CPA- 
secure encryption; our positive result shows how to implement the above for a 
specific class of predicate encryption schemes. We now provide further details. 

Impossibility results. Our negative results are in the same model used by
Boneh et al. [5], which builds on the model used in the seminal work of
Impagliazzo and Rudich [?]. Specifically, as in [5], our negative results hold relative to
a random oracle (with trapdoor) and so rule out black-box constructions from
trapdoor permutations as well as from any (standard) CCA-secure public-key
encryption scheme.

1 This is stronger than what is required, but makes sense in a black-box setting where
computational hardness comes only from the underlying CPA-secure scheme.

A slightly informal statement of our result follows. Fix $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$, a
sequence of predicate classes and attribute sets indexed by the security
parameter n. We say that $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ can be q-covered if for every set system $\{S_f\}_{f \in \mathcal{F}_n}$
with $S_f \subseteq [q(n)]$ (where $[q] \stackrel{\mathrm{def}}{=} \{1, \ldots, q\}$), there are polynomially-many predicates
$f^*, f_1, \ldots, f_p \in \mathcal{F}_n$ such that, with high probability:

1. $S_{f^*} \subseteq \bigcup_{i=1}^{p} S_{f_i}$.

2. There exists an $I \in \mathbb{A}_n$ with $f_1(I) = \cdots = f_p(I) = 0$ but $f^*(I) = 1$.

$\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ is easily covered if it is q-covered for every polynomial q. We show:

Theorem. If $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ is easily covered, there is no black-box construction
of a predicate encryption scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ based on trapdoor
permutations (or CCA-secure encryption).

Intuitively, if $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ is easily covered then the combinatorial approach
discussed earlier cannot work: letting $q(n)$ be the (necessarily) polynomial number
of keys for the underlying (standard) encryption scheme, no matter how the
secret keys $\{sk_i\}_{i=1}^{q}$ are apportioned to the personal secret keys $\{SK_f\}_{f \in \mathcal{F}_n}$, an
adversary can carry out the following attack (cf. Definition 2 below):

1. Request the keys $SK_{f_1}, \ldots, SK_{f_p}$, where each $SK_{f_i} = \{sk_{i_1}, \ldots\} \subseteq \{sk_i\}_{i=1}^{q}$.

2. Request the challenge ciphertext C to be encrypted using an attribute I for
which $f_1(I) = \cdots = f_p(I) = 0$ but $f^*(I) = 1$.

3. Compute the key $SK_{f^*} \subseteq \bigcup_i SK_{f_i}$ and use this key to decrypt C.

This constitutes a valid attack since $SK_{f^*}$ suffices to decrypt C yet the adversary
only requested $SK_{f_1}, \ldots, SK_{f_p}$, none of which suffices on its own to decrypt C.
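
For the special case of point functions (IBE), the covering step of this attack can be illustrated by a simple greedy search. The sketch below is an illustration only, not the argument used in the formal proof: `key_sets` is a hypothetical set system mapping each identity to the indices of the underlying keys in its personal secret key, and the pigeonhole-style bound in the comments is specific to this toy setting.

```python
import random

def find_cover(key_sets):
    """Greedy search for (f*, [f_1, ..., f_p]) with S_{f*} contained in the
    union of the S_{f_i}.

    key_sets maps each identity to the set of underlying key indices that
    its personal secret key contains (a hypothetical set system).
    """
    chosen, union = [], set()
    for identity, keys in key_sets.items():
        if keys <= union:
            # Every key of this identity is already held: it is f*.
            return identity, chosen
        # Otherwise this identity contributes at least one new key index.
        chosen.append(identity)
        union |= keys
    return None

# Usage: q = 5 underlying keys, 16 identities. Each requested identity adds
# a new key index, so at most q identities are requested before a covered
# one must appear among the remaining ones.
q = 5
key_sets = {ident: set(random.sample(range(q), random.randint(1, q)))
            for ident in range(16)}
target, requested = find_cover(key_sets)
print(target, requested)
```

For point functions the challenge attribute is simply the target identity itself: none of the requested predicates accepts it, while the target's predicate does.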

Turning this intuition into a formal proof must, in particular, implicitly show
that the combinatorial approach sketched earlier is essentially the only black-box
approach to building predicate encryption schemes from trapdoor permutations.
Moreover, we actually prove a stronger quantitative version of the above theorem
showing, roughly, that if $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ is q-covered then any predicate encryption
scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ must use at least q + 1 underlying encryption keys.

One might wonder whether the “easily covered” condition is useful for
determining whether there exist black-box constructions of predicate encryption
schemes over $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ of interest. We show that it is, in that the following
corollary can be proven fairly easily given the above:

Corollary. There are no black-box constructions of (1) identity-based encryp- 
tion, (2) forward-secure encryption (for a super-polynomial number of time pe- 
riods), or (3) broadcast encryption (where a super-polynomial number of users 
can be excluded) from trapdoor permutations. 

The first result was shown in [5]; the point is that our impossibility result strictly
generalizes theirs. Moreover, as indicated earlier, we prove a quantitative version
of their result (as well as all other results stated in the above corollary).

Positive result. On the positive side, we show that the combinatorial approach
suggested at the outset can be implemented for $\{(\mathcal{F}_n, \mathbb{A}_n)\}_n$ having the following
property: for each $I \in \mathbb{A}_n$ there are at most polynomially-many $f \in \mathcal{F}_n$ for which
$f(I) = 0$; i.e., for each I there are at most polynomially-many predicates that
are “excluded”. (The positive result from [?], where there are only polynomially-many
predicates, is thus obtained as a corollary.) This is proved by analogy to
broadcast encryption, using the combinatorial techniques from [?].

1.2 Comparison to the Results of Boneh et al. 

Our proof relies heavily on the impossibility result from J5j . Our contribution 
lies in finding the right combinatorial generalization (specifically, the “easily 
covered” property described earlier) of the specific property used by Boneh et al. 
for the particular case of IBE, adapting their proof to our setting, and applying 
their ideas to the more general case of predicate encryption. Our generalization, 
in turn, allows us to show impossibility for several cryptosystems of interest 
besides IBE (cf. the corollary stated earlier), as well as to give quantitative 
versions of their earlier result. Our positive results have no analogue in [5].

2 Definitions 

2.1 Predicate Encryption 

We provide a functional definition of predicate encryption, followed by a weak 
definition of security that we use when proving impossibility and the standard 
definition of security [?] that we use when proving our positive result.

Definition 1. Fix $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$, where $\mathcal{F}_n$ is a set of (efficiently
computable) predicates over the set of attributes $\mathbb{A}_n$. A predicate encryption
scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ consists of four ppt algorithms
(Setup, KeyGen, Enc, Dec) such that:

- Setup is a deterministic algorithm that takes as input a master secret key
  $MSK \in \{0,1\}^n$ and outputs a master public key $MPK$.

- KeyGen is a deterministic algorithm that takes as input the master secret key
  $MSK$ and a predicate $f \in \mathcal{F}_n$ and outputs a secret key
  $SK_f = \mathsf{KeyGen}_{MSK}(f)$. (The assumption that KeyGen is deterministic is
  without loss of generality, since $MSK$ may include a key for a pseudorandom
  function.)

- Enc takes as input the public key $MPK$, an attribute $I \in \mathbb{A}_n$, and a bit $b$.
  It outputs a ciphertext $C \leftarrow \mathsf{Enc}_{MPK}(I, b)$.

- Dec takes as input a secret key $SK_f$ and ciphertext $C$. It outputs either a
  bit $b$ or the distinguished symbol $\perp$.

It is required that for all $n$, all $MSK \in \{0,1\}^n$ and $MPK = \mathsf{Setup}(MSK)$,
all $f \in \mathcal{F}_n$ and $SK_f = \mathsf{KeyGen}_{MSK}(f)$, all $I \in \mathbb{A}_n$, and all $b \in \{0,1\}$,
if $f(I) = 1$ then $\mathsf{Dec}_{SK_f}(\mathsf{Enc}_{MPK}(I, b)) = b$.

Definition 2. A predicate encryption scheme over $(\mathcal{F}, \mathbb{A})$ is weakly payload
hiding if the advantage of any ppt adversary $\mathcal{A}$ in the following game is negligible:

1. $\mathcal{A}(1^n)$ outputs $I^* \in \mathbb{A}_n$ and $f_1, \ldots, f_p \in \mathcal{F}_n$ such that $f_i(I^*) = 0$ for all $i$.

2. Choose $MSK \leftarrow \{0,1\}^n$; let $MPK := \mathsf{Setup}(MSK)$ and set
   $SK_{f_i} := \mathsf{KeyGen}(MSK, f_i)$ for all $i$. Choose $b \leftarrow \{0,1\}$, and compute the
   ciphertext $C^* \leftarrow \mathsf{Enc}_{MPK}(I^*, b)$. Then $\mathcal{A}$ is given
   $(MPK, SK_{f_1}, \ldots, SK_{f_p}, C^*)$.

3. $\mathcal{A}$ outputs $b'$ and succeeds if $b' = b$.

The advantage of $\mathcal{A}$ is defined as $\left|\Pr[\mathcal{A} \text{ succeeds}] - \tfrac{1}{2}\right|$.
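The game of Definition 2 can be sketched as a small test harness. This is only an illustration: the method names `setup`, `keygen`, `enc`, `choose`, and `guess` are hypothetical and do not come from the paper, and `LeakyScheme` is a deliberately broken toy whose ciphertext reveals the bit, so the adversary shown reaches the maximal advantage of 1/2.

```python
import secrets

def weak_payload_hiding_game(scheme, adversary, n):
    """Run the game of Definition 2 once; return True iff the adversary succeeds."""
    # Step 1: the adversary commits to I* and predicates f_1..f_p with f_i(I*) = 0.
    i_star, preds = adversary.choose(n)
    assert all(f(i_star) == 0 for f in preds)
    # Step 2: keys and the challenge ciphertext are generated honestly.
    msk = secrets.randbits(n)
    mpk = scheme.setup(msk)
    keys = [scheme.keygen(msk, f) for f in preds]
    b = secrets.randbits(1)
    c_star = scheme.enc(mpk, i_star, b)
    # Step 3: the adversary outputs a guess b'.
    return adversary.guess(mpk, keys, c_star) == b

class LeakyScheme:
    """Hypothetical insecure toy: the ciphertext carries b in the clear."""
    def setup(self, msk):
        return ("mpk", msk)
    def keygen(self, msk, f):
        return ("sk", msk, f)
    def enc(self, mpk, attr, b):
        return (attr, b)

class ReadingAdversary:
    """Hypothetical adversary that reads b off the leaky ciphertext."""
    def choose(self, n):
        return 0, [lambda attr: 0]
    def guess(self, mpk, keys, c_star):
        return c_star[1]

trials = 200
wins = sum(weak_payload_hiding_game(LeakyScheme(), ReadingAdversary(), 16)
           for _ in range(trials))
advantage = abs(wins / trials - 0.5)
```

A scheme satisfying Definition 2 would keep the empirical advantage of every efficient adversary close to 0 in such a harness.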

Definition 3. A predicate encryption scheme over $(\mathcal{F}, \mathbb{A})$ is payload hiding if
the advantage of any ppt adversary $\mathcal{A}$ in the following game is negligible:

1. A random $MSK \in \{0,1\}^n$ is chosen, and $\mathcal{A}$ is given $MPK := \mathsf{Setup}(MSK)$.

2. $\mathcal{A}$ adaptively requests keys $SK_{f_1}, \ldots$ corresponding to predicates
   $f_1, \ldots \in \mathcal{F}_n$.

3. At some point, $\mathcal{A}$ outputs $I^* \in \mathbb{A}_n$. A random $b \in \{0,1\}$ is chosen and
   $\mathcal{A}$ is given the ciphertext $C^* \leftarrow \mathsf{Enc}_{MPK}(I^*, b)$. $\mathcal{A}$ may continue to
   request keys for predicates of its choice.

4. $\mathcal{A}$ outputs $b'$ and succeeds if (1) $\mathcal{A}$ never requested a key for a predicate
   $f$ with $f(I^*) = 1$, and (2) $b' = b$.

The advantage of $\mathcal{A}$ is defined as $\left|\Pr[\mathcal{A} \text{ succeeds}] - \tfrac{1}{2}\right|$.

Our construction of Section 5 can be modified to achieve the even stronger notion
of attribute hiding; we refer to [?] for a definition.

2.2 A Random Trapdoor Permutation Oracle 

We assume the reader is familiar with the usual model in which black-box
impossibility results are proved; see [12, 17, 15] for further details. We show an
oracle $O$ relative to which trapdoor permutations and CCA-secure encryption
exist, yet any construction of a predicate encryption scheme (for certain
$(\mathcal{F}, \mathbb{A})$) relative to $O$ is insecure against a polynomial-time adversary given
access to $O$ and a PSPACE oracle. Our oracle $O = (g, e, d)$ is defined as follows,
for each $n \in \mathbb{N}$:

- $g$ is chosen uniformly from the space of permutations on $\{0,1\}^n$. We view $g$
  as taking a secret key $sk$ as input, and returning a public key $pk$.

- $e : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n$ maps a public key $pk$ and a “message”
  $m \in \{0,1\}^n$ to a “ciphertext” $c \in \{0,1\}^n$. It is chosen uniformly subject
  to the constraint that $e(pk, \cdot)$ is a permutation on $\{0,1\}^n$ for every $pk$.

- $d : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n$ maps a secret key $sk$ and a ciphertext $c$
  to a message $m$. We require that $d(sk, c)$ outputs the unique $m$ for which
  $e(g(sk), m) = c$.

With overwhelming probability $O$ is a trapdoor permutation [?]. Moreover,
since the components of $O$ are chosen at random subject to the above
constraints (and not with some “defect” as in, e.g., [10]), $O$ implies CCA-secure
encryption [?].
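For intuition, the oracle $O = (g, e, d)$ can be sampled explicitly for a tiny security parameter. The sketch below is illustrative only: it encodes $n$-bit strings as integers, stores the full tables (which is exponential in $n$), and the name `sample_oracle` is hypothetical.

```python
import random

def sample_oracle(n, rng):
    """Sample O = (g, e, d) on n-bit strings, encoded as integers in [0, 2^n)."""
    domain = list(range(2 ** n))
    # g: a uniformly random permutation mapping secret keys to public keys.
    images = domain[:]
    rng.shuffle(images)
    g = dict(zip(domain, images))
    # e(pk, .): an independent uniformly random permutation for every pk.
    e = {}
    for pk in domain:
        ciphertexts = domain[:]
        rng.shuffle(ciphertexts)
        for m, c in zip(domain, ciphertexts):
            e[(pk, m)] = c
    # d(sk, c): the unique m for which e(g(sk), m) = c.
    d = {(sk, e[(g[sk], m)]): m for sk in domain for m in domain}
    return g, e, d

g, e, d = sample_oracle(3, random.Random(0))
```

By construction, `d[(sk, e[(g[sk], m)])] == m` holds for every `sk` and `m`, which is the decryption requirement stated above.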

We denote a query $\alpha$ to $O$ as, e.g., $\alpha := [g(sk) = pk]$ and similarly for $e$ and
$d$ queries. In describing our attack in the next section, we often use a partial
oracle $O'$ that is defined only on some subset of the possible inputs. We always
enforce that such oracles be consistent:

Definition 4. A partial oracle $O' = (g', e', d')$ is consistent if:

1. For every $pk \in \{0,1\}^n$, the (partial) function $e'(pk, \cdot)$ is one-to-one.

2. For every $sk \in \{0,1\}^n$, the (partial) function $d'(sk, \cdot)$ is one-to-one.

3. For all $x \in \{0,1\}^n$, and all $sk$ such that $g'(sk) = pk$ is defined, the value
   $e'(pk, x) = c$ is defined if and only if $d'(sk, c) = x$ is defined.
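The three conditions of Definition 4 translate directly into a check on finite tables. The sketch below assumes a partial oracle is represented by three dictionaries (a representation chosen here for illustration, not taken from the paper): `g` maps `sk` to `pk`, `e` maps `(pk, x)` to `c`, and `d` maps `(sk, c)` to `x`.

```python
def is_consistent(g, e, d):
    """Check the three conditions of Definition 4 for a partial oracle (g, e, d)."""
    # Condition 1: for every pk, e(pk, .) is one-to-one on its defined inputs.
    seen = set()
    for (pk, _x), c in e.items():
        if (pk, c) in seen:
            return False
        seen.add((pk, c))
    # Condition 2: for every sk, d(sk, .) is one-to-one on its defined inputs.
    seen = set()
    for (sk, _c), x in d.items():
        if (sk, x) in seen:
            return False
        seen.add((sk, x))
    # Condition 3: whenever g(sk) = pk is defined, e(pk, x) = c iff d(sk, c) = x.
    for sk, pk in g.items():
        for (pk2, x), c in e.items():
            if pk2 == pk and d.get((sk, c)) != x:
                return False
        for (sk2, c), x in d.items():
            if sk2 == sk and e.get((pk, x)) != c:
                return False
    return True

g_part = {"00": "10"}
e_part = {("10", "01"): "11"}
d_part = {("00", "11"): "01"}
ok = is_consistent(g_part, e_part, d_part)
missing_d = is_consistent(g_part, e_part, {})
collision = is_consistent({}, {("10", "01"): "11", ("10", "00"): "11"}, {})
```

Here `ok` is consistent, `missing_d` violates condition 3 (an encryption entry without the matching decryption entry), and `collision` violates condition 1.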

3 An Impossibility Result for Predicate Encryption 

We define a combinatorial property on $(\mathcal{F}_n, \mathbb{A}_n)$ and formally state our
impossibility result. We describe in Section 3.1 an adversary $\mathcal{A}$ attacking any
black-box construction of a predicate encryption scheme satisfying the conditions
of our theorem; an analysis of $\mathcal{A}$ is given in Appendix A and the full version.

Fix a set $\mathcal{F}$ and a positive integer $q$, and let $[q] := \{1, \ldots, q\}$. An $\mathcal{F}$-set
system over $[q]$ is a collection of sets $\{S_f\}_{f \in \mathcal{F}}$ where each $f \in \mathcal{F}$ is
associated with a set $S_f \subseteq [q]$.

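Definition 5. Let $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ be a sequence of predicates and attributes.
We say $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ can be $q$-covered if there exist ppt algorithms
$(A_1, A_2, A_3)$, where $A_2(1^n, f)$ is deterministic and outputs $I \in \mathbb{A}_n$ with
$f(I) = 1$, such that for $n$ sufficiently large:

For any $\mathcal{F}_n$-set system $\{S_f\}_{f \in \mathcal{F}_n}$ over $[q(n)]$, if we compute

\[ f^* \leftarrow A_1(1^n); \quad I^* := A_2(1^n, f^*); \quad f_1, \ldots, f_p \leftarrow A_3(1^n, f^*), \]

then with probability at least 4/5,

1. $S_{f^*} \subseteq \bigcup_{i=1}^{p} S_{f_i}$;

2. $f_i(I^*) = 0$ for all $i$.

$\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ is easily covered if it can be $q$-covered for every polynomial $q$.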

Although the above definition may seem rather complex and hard to use, we 
show in Section 4 that it can be applied quite easily to several interesting classes
of predicate encryption schemes. Moreover, the definition is natural given the 
attack we will describe in the following section. 

A black-box construction of predicate encryption is q-bounded if each of its 
algorithms makes at most q queries to O. We now state our main result: 

Theorem 1. If $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ can be $q$-covered, then there is no $q$-bounded
black-box construction of a weakly payload-hiding predicate encryption scheme over
$\{(\mathcal{F}_n, \mathbb{A}_n)\}$ from trapdoor permutations (or CCA-secure encryption).

Since each algorithm defining the predicate encryption scheme can make at most 
polynomially-many queries to its oracle, we have 

Corollary 1. If $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ is easily covered, there is no black-box construction
of a weakly payload-hiding predicate encryption scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ from
trapdoor permutations (or CCA-secure encryption).

3.1 The Attack

Fix an $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ that can be $q$-covered, and let PE = (Setup, KeyGen, Enc, Dec)
be a predicate encryption scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ each of whose algorithms
makes at most $q = \mathrm{poly}(n)$ queries to $O = (g, e, d)$. We assume, without loss of
generality, that before any algorithm of PE makes a query of the form $[d(sk, \cdot)]$,
it first makes the query $[g(sk)]$.

We begin the proof of Theorem 1 by describing an adversary $\mathcal{A}$ attacking PE.
Adversary $\mathcal{A}$ is given access to $O$ and makes a polynomial number of calls to this
oracle; as described, $\mathcal{A}$ is not efficient but it runs in polynomial time given access
to a PSPACE-complete oracle (or if $\mathcal{P} = \mathcal{NP}$), and this suffices to prove
black-box impossibility as in previous work [12, 17, 15]. Our description of the
attack is directly motivated by the attacker described in [5].

Let $A_1$, $A_2$, and $A_3$ be as guaranteed by Definition 5, and let $p = \mathrm{poly}(n)$
bound the number of predicates output by $A_3$. Throughout $\mathcal{A}$'s execution, when
it makes a query to $O$ it stores the query and the response in a list $L$. We also
require that before $\mathcal{A}$ makes any query of the form $[d(sk, \cdot)]$, it first makes the
query $[g(sk)]$. Furthermore, once the query $[g(sk) = pk]$ has been made then
$[e(pk, x) = y]$ is added to $L$ if and only if $[d(sk, y) = x]$ is added to $L$.

Setup and challenge. $\mathcal{A}(1^n)$ computes $f^* \leftarrow A_1(1^n)$, $I^* := A_2(1^n, f^*)$, and
$(f_1, \ldots, f_p) \leftarrow A_3(1^n, f^*)$. Then:

1. If $f_i(I^*) = 0$ for all $i$, then $\mathcal{A}$ outputs $(I^*, f_1, \ldots, f_p)$ and receives the
   values $(MPK, SK_{f_1}, \ldots, SK_{f_p}, C^*)$ from the challenger (cf. Definition 2).

2. Otherwise, $\mathcal{A}$ aborts and outputs a random bit $b' \leftarrow \{0,1\}$.

Step 1: Discovering important public keys. For $i = 1$ to $p$, adversary $\mathcal{A}$
does the following:

1. Compute $I_{f_i} = A_2(1^n, f_i)$, and choose random $b \leftarrow \{0,1\}$ and $r \leftarrow \{0,1\}^n$.

2. Compute $\mathsf{Dec}^{O}_{SK_{f_i}}\big(\mathsf{Enc}^{O}_{MPK}(I_{f_i}, b; r)\big)$, storing all $O$-queries in the list $L$.

Step 2: Discovering frequent queries for $I^*$. $\mathcal{A}$ repeats the following $q \cdot p^3$
times: Choose random $b \leftarrow \{0,1\}$ and $r \leftarrow \{0,1\}^n$; compute
$\mathsf{Enc}^{O}_{MPK}(I^*, b; r)$, storing all $O$-queries in $L$.

Step 3: Discovering secret queries and decrypting the challenge. $\mathcal{A}$
chooses $k \leftarrow [q \cdot p^3]$ and runs the following $k$ times.

1. $\mathcal{A}$ uniformly generates a secret key $MSK'$ and a consistent partial oracle
   $O'$ for which (1) $\mathsf{Setup}^{O'}(MSK') = MPK$; (2) for all $i$ it holds that
   $\mathsf{KeyGen}^{O'}_{MSK'}(f_i) = SK_{f_i}$; (3) the oracle $O'$ is consistent with $L$; and
   (4) the key $SK'_{f^*} := \mathsf{KeyGen}^{O'}_{MSK'}(f^*)$ is well-defined.

   We denote by $L'$ the set of queries in $O'$ that are not in $L$ (the “invented
   queries”). Note that $|L'| \le q \cdot (p + 2)$, since at most $q$ queries are made by
   Setup and KeyGen makes at most $q$ queries for each of
   $SK_{f^*}, SK_{f_1}, \ldots, SK_{f_p}$.

2. $\mathcal{A}$ chooses $b \leftarrow \{0,1\}$ and $r \leftarrow \{0,1\}^n$, and computes
   $C := \mathsf{Enc}^{O}_{MPK}(I^*, b; r)$ (storing all $O$-queries in $L$). For an oracle $O''$
   defined below, $\mathcal{A}$ then does:

   (a) In iteration $k' < k$, adversary $\mathcal{A}$ computes $\mathsf{Dec}^{O''}_{SK'_{f^*}}(C)$.

   (b) In iteration $k$, adversary $\mathcal{A}$ computes $b' = \mathsf{Dec}^{O''}_{SK'_{f^*}}(C^*)$.

Output: $\mathcal{A}$ outputs the bit $b'$ computed in the $k$-th iteration of step 3.

Before defining the oracle $O''$ used above, we introduce some notation. Let $L$,
$O'$, and $MSK'$ be as above, and note that we can view $L$ and $O'$ as tuples of
(partial) functions $(g, e, d)$ and $(g', e', d')$ where $g'$, $e'$, and $d'$ extend $g$, $e$,
and $d$, respectively. Define the following:

- $Q'_S$ is the set of $pk$ for which $[g'(sk) = pk]$ is queried during computation of
  $\mathsf{Setup}^{O'}(MSK')$.

- $Q'_K$ is the set of $pk$ for which $[g'(sk) = pk]$ is queried during computation of
  $\mathsf{KeyGen}^{O'}_{MSK'}(f)$ for some $f \in \{f^*, f_1, \ldots, f_p\}$.

- $Q'_{K-S} := Q'_K \setminus Q'_S$.

- $L_g$ is the set of $pk$ for which the query $[g(sk) = pk]$ is in $L$.

Note that $\mathcal{A}$ can compute each of these sets from its view. Note further that
$Q'_S$, $Q'_K$, $Q'_{K-S}$, and $O'$ are fixed throughout an iteration of step 3, but $L_g$
may change as queries are answered.

Oracle $O''$ is defined as follows. For any query whose answer is defined by $O'$,
return that answer. Otherwise:

1. For an encryption query $e(pk, x)$ with $pk \in Q'_{K-S} \setminus L_g$, return a random
   $y$ consistent with the rest of $O''$. Act analogously for a decryption query
   $d(sk, y)$ with $pk \in Q'_{K-S} \setminus L_g$ (where $pk = g(sk)$).

2. For a decryption query $d(sk, y)$, if there exists a $pk$ with $[g(sk) = pk] \in O'$
   and (cf. footnote 2) there exists an $sk' \neq sk$ with $[g(sk') = pk] \in L$, then use
   $O''$ to answer the query $d(sk', y)$.

3. In any other case, query the real oracle $O$ and return the result. Store the
   query/answer in $L$ (note that this might affect $L_g$ as well).

An analysis of $\mathcal{A}$, proving Theorem 1, appears in Appendix A and the full version
of our paper. The analysis is very similar to the one given in [5], with the main
difference being Proposition 1.

4 Impossibility for Specific Cases 

We use Theorem 1 to rule out black-box constructions of predicate encryption
schemes in several specific cases of interest. Specifically, we consider the cases of 
identity-based encryption, forward-secure encryption, and broadcast encryption. 
We begin with a useful lemma. 

2 Although $O'$ is chosen to be consistent, a conflict can occur since $L$ is updated as $\mathcal{A}$
makes additional queries to the real oracle $O$.

Lemma 1. Fix $q(\cdot)$, and assume $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ has the following property: For
sufficiently large $n$, there exist $f_1, \ldots, f_{5q} \in \mathcal{F}_n$ and $I_1, \ldots, I_{5q} \in \mathbb{A}_n$ such that:

For all $i \in \{1, \ldots, 5q\}$ it holds that $f_i(I_i) = 1$ but $f_j(I_i) = 0$ for $j > i$.

Then $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ can be $q$-covered. If the above holds for every polynomial $q$,
then $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ is easily covered.

Proof. We show that, under the stated assumption, $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ satisfies
Definition 5. Fix $q$ and $n$ large enough so that the condition of the lemma holds,
and let $f_1, \ldots, f_{5q}$ and $I_1, \ldots, I_{5q}$ be as stated. Define algorithms $A_1$, $A_2$, $A_3$ as
follows:

1. $A_1(1^n)$ chooses $i \leftarrow \{1, \ldots, 5q\}$ and outputs $f^* = f_i$.

2. $A_2(1^n, f^*)$ finds $i$ for which $f^* = f_i$ and outputs $I^* = I_i$.

3. $A_3(1^n, f^*)$ finds $i$ for which $f^* = f_i$ and outputs $f_{i+1}, \ldots, f_{5q}$. (If $i = 5q$
   then output nothing.)

Note that $A_2(1^n, f^*)$ always outputs $I^*$ with $f^*(I^*) = 1$. We show that for any
$\mathcal{F}_n$-set system $\{S_f\}_{f \in \mathcal{F}_n}$ over $[q]$, the conditions of Definition 5 hold. We begin
with the following claim:

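Claim. For any $\mathcal{F}_n$-set system $\{S_f\}_{f \in \mathcal{F}_n}$ over $[q]$, there are at most $q$ values
$i \in \{1, \ldots, 5q\}$ for which $S_{f_i} \not\subseteq \bigcup_{j > i} S_{f_j}$. (By convention, the union is the
empty set if $i = 5q$.)

Proof. Define $S_i := \bigcup_{i < j \le 5q} S_{f_j}$, with $S_{5q} = \emptyset$. Note that
$S_{i-1} = S_i \cup S_{f_i}$, and so $S_{f_i} \not\subseteq \bigcup_{i < j \le 5q} S_{f_j} = S_i$ iff $S_i \subsetneq S_{i-1}$. Since

\[ S_{5q} \subseteq S_{5q-1} \subseteq \cdots \subseteq S_1 \subseteq S_0 \subseteq [q], \]

there can be at most $q$ indices $i$ where this occurs. □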

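Fixing an arbitrary $\mathcal{F}_n$-set system $\{S_f\}_{f \in \mathcal{F}_n}$ over $[q]$, let $\mathcal{I} \subseteq \{1, \ldots, 5q\}$ be
the set of indices $i$ for which $S_{f_i} \subseteq \bigcup_{i < j \le 5q} S_{f_j}$; the claim above shows that
$|\mathcal{I}| \ge 4q$. If $A_1$ chooses $i \in \mathcal{I}$ then:

1. $S_{f^*} = S_{f_i} \subseteq \bigcup_{i < j \le 5q} S_{f_j}$.

2. $f_j(I^*) = f_j(I_i) = 0$ for all the predicates $f_{i+1}, \ldots, f_{5q}$ output by $A_3$.

Since $A_1$ chooses $i \in \mathcal{I}$ with probability at least 4/5, this proves the lemma. □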
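The counting argument in the claim can be checked numerically. The sketch below (an illustration with hypothetical names, not part of the proof) scans a set system $S_{f_1}, \ldots, S_{f_{5q}}$ from the last index to the first and records every index whose set is not contained in the union of the later sets; each such index contributes a new element of $[q]$, so there can be at most $q$ of them.

```python
import random

def uncovered_indices(sets):
    """1-based indices i with sets[i-1] not contained in the union of the later sets."""
    bad, later_union = [], set()
    for i in range(len(sets), 0, -1):      # scan i = 5q, ..., 1
        if not sets[i - 1] <= later_union:
            bad.append(i)                  # S_{f_i} adds a new element of [q]
        later_union |= sets[i - 1]
    return sorted(bad)

rng = random.Random(1)
q = 6
worst = 0
for _ in range(1000):
    # Random set system over [q] with 5q sets (each element included w.p. 0.3).
    system = [{x for x in range(1, q + 1) if rng.random() < 0.3}
              for _ in range(5 * q)]
    worst = max(worst, len(uncovered_indices(system)))
```

Since at most $q$ of the $5q$ indices are uncovered, a uniformly chosen index is covered with probability at least $(5q - q)/5q = 4/5$, which is the bound used in the lemma.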

We now apply Lemma 1 to several specific cases.

Identity-based encryption. It is easy to see that IBE for identities $\{\mathcal{I}_n\}$
can be viewed as an instance of predicate encryption by setting $\mathbb{A}_n = \mathcal{I}_n$ and
$\mathcal{F}_n = \{f_{ID}\}_{ID \in \mathcal{I}_n}$ where

\[ f_{ID}(ID') := \begin{cases} 1 & \text{if } ID' = ID \\ 0 & \text{otherwise.} \end{cases} \]

Let $N = |\mathcal{I}_n|$ denote the size of the identity space. Boneh et al. [5] already
rule out black-box constructions of IBE from trapdoor permutations for
$N = \omega(\mathrm{poly}(n))$; the next theorem shows that our Theorem 1 generalizes their result:

Theorem 2. There is no black-box construction (from trapdoor permutations
or CCA-secure encryption) of an IBE scheme for $5N$ identities where each
algorithm makes fewer than $N$ queries to its oracle.

As a corollary, there is no black-box construction of an IBE scheme (from
trapdoor permutations or CCA-secure encryption) for a super-polynomial number
of identities.

Proof. Let $\mathcal{I}_n = \{ID_1, \ldots, ID_{5N}\}$. It is not hard to see that $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$
can be $N$-covered: take $f_{ID_1}, \ldots, f_{ID_{5N}}$ and set $I_i = ID_i$ for all $i$. Then apply
Theorem 1. □


Forward-secure public-key encryption. In a forward-secure public-key
encryption scheme [?], secret keys are associated with time periods; the secret key
at time period $i$ enables decryption for ciphertexts encrypted at any time $j \ge i$.
(We refer the reader to [?] for further discussion.) A forward-secure encryption
scheme supporting $N = N(n)$ time periods can be cast as a predicate encryption
scheme by letting $\mathbb{A}_n = \{1, \ldots, N\}$ and $\mathcal{F}_n = \{f_i\}_{1 \le i \le N}$ where

\[ f_i(j) := \begin{cases} 1 & \text{if } j \ge i \\ 0 & \text{otherwise.} \end{cases} \]

(A forward-secure encryption scheme imposes the additional requirement that
$SK_{f_{i+1}}$ can be derived from $SK_{f_i}$; since we do not impose this requirement
our impossibility result is even stronger.) A black-box construction of a
forward-secure encryption scheme from any CPA-secure encryption scheme exists for any
$N = \mathrm{poly}(n)$: the master public key contains public keys $\{pk_1, \ldots, pk_N\}$, and the
secret key at period $i$ is $SK_{f_i} = \{sk_i, \ldots, sk_N\}$; encryption at period $j$ uses $pk_j$.
While such a scheme is trivial as far as forward-secure encryption goes (since
the public/secret key lengths are linear in $N$), it satisfies the definition. The
next theorem indicates that, in some sense, this trivial construction is almost
optimal as far as black-box constructions are concerned; moreover, there is no
black-box construction supporting a super-polynomial number of time periods.
(In contrast, there exist schemes based on specific assumptions [?] that support
an unbounded number of time periods.)

Theorem 3. There is no black-box construction (from trapdoor permutations or
CCA-secure encryption) of a forward-secure encryption scheme for $5N$ periods
where each algorithm in the scheme makes fewer than $N$ queries to its oracle.

As a corollary, there is no black-box construction of a forward-secure
encryption scheme (from trapdoor permutations or CCA-secure encryption) supporting
a super-polynomial number of time periods.

Proof. $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ can be $N$-covered, as taking $f_1, \ldots, f_{5N}$ and setting
$I_i = i$ for all $i$ satisfies the conditions of Lemma 1. Then apply Theorem 1. □


Broadcast encryption. Finally, we look at the case of (public-key) broadcast
encryption [?]. Here, there is a fixed public key and a set of users
$\mathcal{U} = \{1, \ldots, U\}$, each with their own personal secret key; it should be possible
for a sender to encrypt a message in such a way that only some subset
$\mathcal{U}' \subseteq \mathcal{U}$ of users can decrypt. Consider the case where at most $k = k(n) < U$
users are excluded; we refer to this as $k$-exclusion broadcast encryption. This can
also be modeled by predicate encryption, if we let
$\mathbb{A}_n = \{\mathcal{U}' \subseteq \mathcal{U} \mid |\mathcal{U}'| \ge U - k\}$ and define $\mathcal{F}_n = \{f_i\}_{i \in \mathcal{U}}$ where

\[ f_i(\mathcal{U}') := \begin{cases} 1 & \text{if } i \in \mathcal{U}' \\ 0 & \text{otherwise.} \end{cases} \]

Theorem 4. There is no black-box construction (from trapdoor permutations or
CCA-secure encryption) of a $(5k)$-exclusion broadcast encryption scheme where
each algorithm in the scheme makes $k$ or fewer queries to its oracle.

As a corollary, there is no black-box construction of a $k$-exclusion broadcast
encryption scheme (from trapdoor permutations or CCA-secure encryption) for
super-polynomial $k$.

Proof. We show that $\{(\mathcal{F}_n, \mathbb{A}_n)\}_{n \in \mathbb{N}}$ can be $k$-covered. Take $f_1, \ldots, f_{5k}$ and
define

\[ I_i = \mathcal{U} \setminus \{i+1, \ldots, 5k\} \]

for $i \in \{1, \ldots, 5k\}$. (So $I_{5k} = \mathcal{U}$.) Note that $|I_i| \ge U - 5k$ always, and these
satisfy the conditions of Lemma 1. Applying Theorem 1 concludes the proof. □
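The hypothesis of Lemma 1 (that $f_i(I_i) = 1$ while $f_j(I_i) = 0$ for all $j > i$) can be verified mechanically for the three instantiations above. The sketch below uses small hypothetical parameters (`N = 4`, 30 users) and, for broadcast encryption, takes $I_i = \mathcal{U} \setminus \{i+1, \ldots, 5k\}$, which is the choice that makes $f_i(I_i) = 1$ and gives $I_{5k} = \mathcal{U}$.

```python
def satisfies_lemma1(preds, attrs):
    """Check f_i(I_i) = 1 and f_j(I_i) = 0 for all j > i (lists are 0-based)."""
    m = len(preds)
    return all(
        preds[i](attrs[i]) == 1
        and all(preds[j](attrs[i]) == 0 for j in range(i + 1, m))
        for i in range(m)
    )

N = 4  # hypothetical small parameter; the lemma uses 5N predicates

# IBE: f_ID(ID') = 1 iff ID' = ID, with I_i = ID_i.
ids = [f"ID{i}" for i in range(1, 5 * N + 1)]
ibe_preds = [lambda x, ident=ident: int(x == ident) for ident in ids]

# Forward-secure encryption: f_i(j) = 1 iff j >= i, with I_i = i.
periods = list(range(1, 5 * N + 1))
fs_preds = [lambda j, i=i: int(j >= i) for i in periods]

# Broadcast encryption: f_i(U') = 1 iff i in U', with I_i = U \ {i+1, ..., 5k}.
k, num_users = N, 30
users = set(range(1, num_users + 1))
bc_preds = [lambda s, i=i: int(i in s) for i in range(1, 5 * k + 1)]
bc_attrs = [users - set(range(i + 1, 5 * k + 1)) for i in range(1, 5 * k + 1)]

results = (
    satisfies_lemma1(ibe_preds, ids),
    satisfies_lemma1(fs_preds, periods),
    satisfies_lemma1(bc_preds, bc_attrs),
)
```

Reversing the order of the forward-secure predicates breaks the condition, which illustrates that the ordering of $f_1, \ldots, f_{5q}$ matters in the lemma.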


5 A Possibility Result for Predicate Encryption 

Here we show that for the class of predicates and attributes $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ where
(roughly) for each $I \in \mathbb{A}_n$ there are at most polynomially many $f \in \mathcal{F}_n$ with
$f(I) = 0$, there is a black-box construction of a predicate encryption scheme
over $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ based on any CPA-secure encryption scheme. We remark that
while we only prove payload hiding, our construction can in fact be shown to be
attribute hiding [?] as well.

Our construction relies on the notion of an $(N, k)$-cover-free family [?]:

Definition 6. An $(N, k)$-cover-free family over $[U]$ is a family $\mathcal{S} = \{S_1, \ldots, S_N\}$,
with $S_i \subseteq [U]$, such that for any distinct sets $S, S_1, \ldots, S_k \in \mathcal{S}$ it holds that

\[ S \setminus \bigcup_{i=1}^{k} S_i \neq \emptyset. \]

For any $k = \mathrm{poly}(n)$ and $N = 2^{\mathrm{poly}(n)}$ there exist [?] explicit, polynomial-time
constructions of an $(N, k)$-cover-free family over $[U]$ with $U = \mathrm{poly}(n)$.
(The specific results of [?] can be used to improve the efficiency of the
construction that follows, but our only goal here is to show a construction that can
be implemented in polynomial time.)
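To make Definition 6 concrete, the sketch below builds a small cover-free family using the standard polynomial-evaluation construction and verifies the definition by brute force. This construction is an assumption made here for illustration; the explicit constructions cited in the text may differ. Each polynomial $p$ of degree at most $d$ over $\mathbb{Z}_q$ ($q$ prime) yields the set $\{(x, p(x))\}$ of $q$ points; two distinct polynomials agree on at most $d$ points, so $k$ other sets cover at most $kd$ points of a given set, and the family is cover-free whenever $kd < q$.

```python
from itertools import combinations, product

def polynomial_cff(q, d):
    """One set {(x, p(x)) : x in Z_q} per polynomial p of degree <= d over Z_q (q prime)."""
    family = []
    for coeffs in product(range(q), repeat=d + 1):
        points = frozenset(
            (x, sum(c * pow(x, e, q) for e, c in enumerate(coeffs)) % q)
            for x in range(q)
        )
        family.append(points)
    return family

def is_cover_free(family, k):
    """Brute-force check of Definition 6: no set is covered by k other sets."""
    for idx, s in enumerate(family):
        others = family[:idx] + family[idx + 1:]
        for group in combinations(others, k):
            if not (s - frozenset().union(*group)):
                return False
    return True

family = polynomial_cff(3, 1)            # 9 sets over a universe of 9 points
two_cover_free = is_cover_free(family, 2)
three_cover_free = is_cover_free(family, 3)
```

With $q = 3$ and $d = 1$ the family is 2-cover-free but not 3-cover-free: the line $y = 0$ is covered by the three lines $y = x$, $y = x + 1$, and $y = x + 2$.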

Theorem 5. Fix $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ and set $\mathsf{Neg}_I := \{f \in \mathcal{F}_n : f(I) = 0\}$ for $I \in \mathbb{A}_n$. If
there is a poly-time algorithm ListNeg for which $\mathsf{ListNeg}(1^n, I) = \mathsf{Neg}_I$, then there
is a black-box construction of a predicate encryption scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}$
from any CPA-secure encryption scheme.

Proof. Since ListNeg runs in polynomial time, there is a polynomial $k$ for which
$|\mathsf{Neg}_I| \le k(n)$ for all $I \in \mathbb{A}_n$. Say predicates in $\mathcal{F}_n$ can be represented using
$\ell(n) = \mathrm{poly}(n)$ bits. Let $\{U_n\}$ be such that $U_n = \mathrm{poly}(n)$ and such that, for
each $n$, there is an explicit $(2^{\ell(n)}, k(n))$-cover-free family
$\mathcal{S} = \{S_1, \ldots, S_{2^{\ell(n)}}\}$ over $[U_n]$. Identifying $\mathcal{F}_n$ with a subset of $[2^{\ell(n)}]$, we
can view the cover-free family as $\mathcal{S} = \{S_f\}_{f \in \mathcal{F}_n}$.
Let (Gen', Enc', Dec') be a CPA-secure encryption scheme. Our construction
of a predicate encryption scheme over $\{(\mathcal{F}_n, \mathbb{A}_n)\}$ is as follows:

- Setup, on input $1^n$ and a sufficiently long random string $MSK$, runs $\mathsf{Gen}'(1^n)$
  a total of $U = U_n$ times to generate keys $(pk_1, sk_1), \ldots, (pk_U, sk_U)$. The
  master public key is $\{pk_1, \ldots, pk_U\}$.

- KeyGen, given the secret keys $\{sk_i\}_{i=1}^{U}$ and a predicate $f \in \mathcal{F}_n$, outputs the
  subset $\{sk_i\}_{i \in S_f}$.

- Enc, given the public key, an attribute $I \in \mathbb{A}_n$, and a message $m$, computes
  $\mathsf{Neg}_I = \mathsf{ListNeg}(I)$ and sets $\mathcal{U} = [U] \setminus \big(\bigcup_{f \in \mathsf{Neg}_I} S_f\big)$. The ciphertext is
  $(I, \{C_i\}_{i \in \mathcal{U}})$ where $C_i \leftarrow \mathsf{Enc}'_{pk_i}(m)$.

- Dec, given the secret key $\{sk_i\}_{i \in S_f}$ for a predicate $f$ and a ciphertext
  $(I, \{C_i\}_{i \in \mathcal{U}})$ for which $f(I) = 1$, first finds an index $i$ for which
  $i \in S_f \cap \mathcal{U}$. (Such an index must exist, since

  \[ S_f \cap \mathcal{U} = S_f \setminus \bigcup_{f' \in \mathsf{Neg}_I} S_{f'}, \]

  and there are at most $k$ predicates $f'$ that the union is taken over.) The
  output is $\mathsf{Dec}'_{sk_i}(C_i)$.
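The construction can be exercised end to end on a toy instance. Everything concrete in the sketch below is an assumption made for illustration: the cover-free family is the polynomial-based one with hypothetical parameters `Q = 5`, `D = 1`; the predicate class is an exclusion-style class in which $f_i(I) = 0$ iff $i \in I$; and `gen`, `enc1`, `dec1` are a lockbox stand-in for (Gen', Enc', Dec') that provides no security at all.

```python
import secrets
from itertools import product

Q, D = 5, 1  # hypothetical parameters: 25 predicates, up to 4 excluded at once

def cover_free_sets(q, d):
    """Polynomial-based cover-free family: one set of q points per polynomial."""
    return [
        frozenset((x, sum(c * pow(x, e, q) for e, c in enumerate(cs)) % q)
                  for x in range(q))
        for cs in product(range(q), repeat=d + 1)
    ]

S = cover_free_sets(Q, D)                       # S[f] is the index set S_f
UNIVERSE = [(x, y) for x in range(Q) for y in range(Q)]

# Toy stand-in for (Gen', Enc', Dec'); NOT encryption, only a lockbox model.
def gen():
    key = secrets.randbits(64)
    return key, key                             # (pk, sk), identical in this toy

def enc1(pk, m):
    return (pk, m)

def dec1(sk, c):
    return c[1] if c[0] == sk else None

def list_neg(attr):
    """Example class: the attribute is the set of excluded predicate indices."""
    return sorted(attr)

def setup():
    pairs = {u: gen() for u in UNIVERSE}
    mpk = {u: pk for u, (pk, _) in pairs.items()}
    msk = {u: sk for u, (_, sk) in pairs.items()}
    return mpk, msk

def keygen(msk, f):
    return {u: msk[u] for u in S[f]}            # {sk_i} for i in S_f

def enc(mpk, attr, m):
    excluded = frozenset().union(*(S[f] for f in list_neg(attr)))
    return attr, {u: enc1(mpk[u], m) for u in UNIVERSE if u not in excluded}

def dec(sk_f, ct):
    _, parts = ct
    for u, sk in sk_f.items():
        if u in parts:                          # an index in S_f that survived
            return dec1(sk, parts[u])
    return None                                 # no usable index: f is excluded

mpk, msk = setup()
ct = enc(mpk, {3, 7, 11, 20}, 1)
```

Every non-excluded predicate retains at least one index because four other sets cover at most four of the five points in its set, while an excluded predicate holds only keys whose positions were removed from the ciphertext.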

It is easy to see that the above construction satisfies correctness. We now prove
security (in the sense of Definition 3). Let $\mathcal{A}$ be an adversary attacking the
scheme. We may assume without loss of generality that $\mathcal{A}$ never requests a
secret key for a predicate $f$ for which $f(I^*) = 1$ (where $I^*$ is the attribute used
to encrypt the challenge ciphertext), since $\mathcal{A}$ cannot succeed if that occurs.

For simplicity we prove security in a non-uniform model, but the proof can be
modified easily to hold in the uniform model in the standard way. We consider
$U + 2$ hybrid experiments $H_0, \ldots, H_{U+1}$, where $H_0$ corresponds to the experiment
of Definition 3 when $b = 0$ is encrypted, and $H_{U+1}$ corresponds to the experiment
of Definition 3 when $b = 1$ is encrypted. Let $\delta_i$ denote the probability that $\mathcal{A}$
outputs ‘0’ in $H_i$. We show that $|\delta_i - \delta_{i+1}|$ is negligible for all $i$; since $U = U_n$
is polynomial in $n$, this proves that $|\delta_0 - \delta_{U+1}|$ is negligible and thus completes
the proof.
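The final step is the standard triangle-inequality (hybrid) argument. Written out under the indexing $H_0, \ldots, H_{U+1}$ used above:

```latex
\[
  |\delta_0 - \delta_{U+1}|
  \;\le\; \sum_{j=0}^{U} |\delta_j - \delta_{j+1}|
  \;\le\; (U+1) \cdot \max_{0 \le j \le U} |\delta_j - \delta_{j+1}| .
\]
```

The right-hand side is negligible because $U = U_n$ is polynomial in $n$ and each difference $|\delta_j - \delta_{j+1}|$ is negligible.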

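Experiment $H_i$ is defined as follows: Steps 1 and 2 are exactly as in
Definition 3. In step 3, however, when encrypting the challenge ciphertext for the
attribute $I^*$, let $\mathcal{U}^* = [U] \setminus \bigcup_{f \in \mathsf{Neg}_{I^*}} S_f$ and set the ciphertext equal to
$(I^*, \{C_j\}_{j \in \mathcal{U}^*})$, where

\[ C_j \leftarrow \begin{cases} \mathsf{Enc}'_{pk_j}(1) & j < i \\ \mathsf{Enc}'_{pk_j}(0) & j \ge i. \end{cases} \]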

$\mathcal{A}$ may continue to request secret keys as in Definition 3.

We now prove that $|\delta_j - \delta_{j+1}|$ is negligible for any $j$. Fix $j$ and consider the
following adversary $\mathcal{A}'$ attacking the underlying encryption scheme (Gen', Enc',
Dec'). Given public key $pk$ and ciphertext $C$ (which is either an encryption of 0
or 1), the adversary $\mathcal{A}'$ proceeds as follows:

1. Set $pk_j = pk$. For $i \neq j$, compute $(pk_i, sk_i) \leftarrow \mathsf{Gen}'(1^n)$. Give the master
   public key $\{pk_1, \ldots, pk_U\}$ to $\mathcal{A}$.

2. When $\mathcal{A}$ requests a secret key for a predicate $f$, then if $j \notin S_f$ give to $\mathcal{A}$ the
   secret keys $\{sk_i\}_{i \in S_f}$. Otherwise, abort and output a random bit.

3. When $\mathcal{A}$ outputs $I^*$, compute $\mathsf{Neg}_{I^*} = \mathsf{ListNeg}(I^*)$ and then set

   \[ \mathcal{U}^* = [U] \setminus \bigcup_{f \in \mathsf{Neg}_{I^*}} S_f. \]

   If $j \notin \mathcal{U}^*$ then abort and output a random bit. Otherwise, give $\mathcal{A}$ the
   ciphertext $(I^*, \{C_i\}_{i \in \mathcal{U}^*})$ where

   \[ C_i \leftarrow \begin{cases} \mathsf{Enc}'_{pk_i}(1) & i < j \\ C & i = j \\ \mathsf{Enc}'_{pk_i}(0) & i > j. \end{cases} \]

4. Subsequent secret key queries made by $\mathcal{A}$ are answered as before. Finally,
   $\mathcal{A}'$ outputs whatever bit is output by $\mathcal{A}$.

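Let $\Pr_j[\cdot]$ denote the probability of an event in experiment $H_j$. We have

\[
\begin{aligned}
 & \left| \Pr[\mathcal{A}' \text{ outputs } 0 \mid C \leftarrow \mathsf{Enc}'_{pk}(0)] - \Pr[\mathcal{A}' \text{ outputs } 0 \mid C \leftarrow \mathsf{Enc}'_{pk}(1)] \right| \\
 &\quad = \big| \Pr[j \in \mathcal{U}^*] \cdot \Pr\nolimits_j[\mathcal{A} \text{ outputs } 0 \mid j \in \mathcal{U}^*]
        - \Pr[j \in \mathcal{U}^*] \cdot \Pr\nolimits_{j+1}[\mathcal{A} \text{ outputs } 0 \mid j \in \mathcal{U}^*] \big| ,
\end{aligned}
\]

using the facts that (1) $\Pr[j \in \mathcal{U}^*]$ is independent of whether $C$ is an encryption
of 0 or 1 and (2) when $C$ is an encryption of 0 (resp., 1) then the view of $\mathcal{A}$
(assuming $j \in \mathcal{U}^*$) is identical to its view in $H_j$ (resp., $H_{j+1}$). Note further that

\[ \Pr\nolimits_j[\mathcal{A} \text{ outputs } 0 \mid j \notin \mathcal{U}^*] = \Pr\nolimits_{j+1}[\mathcal{A} \text{ outputs } 0 \mid j \notin \mathcal{U}^*] \]

since the challenge ciphertext is distributed identically in each case. It follows
that

\[
\begin{aligned}
 & \left| \Pr[\mathcal{A}' \text{ outputs } 0 \mid C \leftarrow \mathsf{Enc}'_{pk}(0)] - \Pr[\mathcal{A}' \text{ outputs } 0 \mid C \leftarrow \mathsf{Enc}'_{pk}(1)] \right| \\
 &\quad = \big| \Pr[j \in \mathcal{U}^*] \cdot \big( \Pr\nolimits_j[\mathcal{A} \text{ outputs } 0 \mid j \in \mathcal{U}^*] - \Pr\nolimits_{j+1}[\mathcal{A} \text{ outputs } 0 \mid j \in \mathcal{U}^*] \big) \\
 &\qquad + \Pr[j \notin \mathcal{U}^*] \cdot \big( \Pr\nolimits_j[\mathcal{A} \text{ outputs } 0 \mid j \notin \mathcal{U}^*] - \Pr\nolimits_{j+1}[\mathcal{A} \text{ outputs } 0 \mid j \notin \mathcal{U}^*] \big) \big| \\
 &\quad = |\delta_j - \delta_{j+1}| ,
\end{aligned}
\]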


concluding the proof.

A Proof Details

We analyze the success probability of the adversary $\mathcal{A}$ from Section 3.1. Due to
space limitations, the proof cannot be reproduced here in its entirety; we have
instead aimed to describe those parts of our proof that differ most prominently
from the proof of Boneh et al. [5]. The most significant new element in our proof
is Proposition 1.

Toward analyzing the success probability of $\mathcal{A}$, we describe a series of
experiments, the first of which corresponds to adversary $\mathcal{A}$ interacting in the
experiment from Definition 2. We show that, as long as no “bad” events (to be
defined later) occur, the statistical distance between the transcripts generated in
each of these experiments is not too large. This allows us to bound $\mathcal{A}$'s success
probability by comparing it to an appropriate event in the final experiment.

$\mathsf{Expt}_0$: This corresponds to $\mathcal{A}$ interacting in the experiment from Definition 2.

$\mathsf{Expt}_1$: This is the same as $\mathsf{Expt}_0$ except that $O''$ (as defined after the $k$-th
repetition of step 3) is used instead of $O$ to compute the challenge ciphertext $C^*$.

$\mathsf{Expt}_2$: This is the same as $\mathsf{Expt}_1$ except that $O''$ never queries $O$ (cf. step 3 in
the definition of $O''$); instead, any such queries are answered randomly (subject
to ensuring that $O''$ remains consistent).

$\mathsf{Expt}_3$: This is the following experiment with no adversary and using the real
oracle $O$:

Setup and challenge

1. Compute $f^* \leftarrow A_1(1^n)$, $I^* = A_2(1^n, f^*)$, and $\{f_1, \ldots, f_p\} \leftarrow A_3(1^n, f^*)$.

2. Choose at random $MSK \leftarrow \{0,1\}^n$ and compute $MPK := \mathsf{Setup}^{O}(MSK)$.
   If $f_i(I^*) = 1$ for some $i$, abort and output a random bit.

3. For every predicate $f \in \{f^*, f_1, \ldots, f_p\}$ compute $SK_f := \mathsf{KeyGen}^{O}_{MSK}(f)$.

Step 1: Discovering important public keys. For $i = 1$ to $p$ do:

1. Compute $I_{f_i} \leftarrow A_2(1^n, f_i)$, and choose random $b_i \leftarrow \{0,1\}$ and $r_i \leftarrow \{0,1\}^n$.

2. Compute $\mathsf{Dec}^{O}_{SK_{f_i}}\big(\mathsf{Enc}^{O}_{MPK}(I_{f_i}, b_i; r_i)\big)$.

Step 2: Decrypting the challenge

1. Choose $r \leftarrow \{0,1\}^n$, $b \leftarrow \{0,1\}$ and compute $C^* := \mathsf{Enc}^{O}_{MPK}(I^*, b; r)$.

2. Compute $b' := \mathsf{Dec}^{O}_{SK_{f^*}}(C^*)$ and output $b'$. Note that $b' = b$ always.

This completes the description of $\mathsf{Expt}_3$.

For $i \in \{0, 1, 2\}$ we will be interested in the following transcripts defined in the
course of $\mathsf{Expt}_i$. These transcripts contain, in particular, all oracle queries/answers.

- $\mathsf{trans}^i_{\mathsf{setup}}$: The transcript of the setup phase. This includes the computation
  of $MPK$ and $SK_{f_1}, \ldots, SK_{f_p}$, as well as the computation of $SK_{f^*}$ for the
  $f^*$ chosen by the adversary. (Even though $SK_{f^*}$ is not computed in the
  experiment, $SK_{f^*}$ is well defined given $f^*$, $MSK$, and $O$.)

- $\mathsf{trans}^i_{\mathsf{pks}}$: The transcript of step 1 (“discovering important public keys”).

- $\mathsf{trans}^i_{\mathsf{freq}}$: The transcript of step 2 (“discovering frequent queries for $I^*$”).

- $\mathsf{trans}^i_{\mathsf{sim\text{-}setup}}$: This is the transcript defined by the adversary's choice of
  $MSK'$ and $O'$ in the $k$-th repetition of step 3, and can be viewed as the
  adversary's “guess” for $\mathsf{trans}^i_{\mathsf{setup}}$.

- $\mathsf{trans}^i_{*}$: The transcript of the encryption of $C$/decryption of $C^*$ in the $k$-th
  repetition of step 3.

- $\mathsf{trans}^i := (\mathsf{trans}^i_{\mathsf{setup}}, \mathsf{trans}^i_{\mathsf{pks}}, \mathsf{trans}^i_{\mathsf{sim\text{-}setup}}, \mathsf{trans}^i_{*})$.

For $\mathsf{Expt}_3$ we define

- $\mathsf{trans}^3_{\mathsf{sim\text{-}setup}}$: The transcript of the “setup and challenge” step.

- $\mathsf{trans}^3_{\mathsf{pks}}$: The transcript of step 1 (“discovering important public keys”).

- $\mathsf{trans}^3_{*}$: The transcript of step 2 (“decrypting the challenge”).

- $\mathsf{trans}^3 := (\mathsf{trans}^3_{\mathsf{pks}}, \mathsf{trans}^3_{\mathsf{sim\text{-}setup}}, \mathsf{trans}^3_{*})$.

For a given transcript, we partition the set of public keys used (i.e., the set of
$pk$'s for which $[g(\cdot) = pk] \in \mathsf{trans}$) into the following sets:

- We let $Q_S(\mathsf{trans})$ denote the public keys queried during execution of Setup:

  \[ Q_S(\mathsf{trans}) := \{pk \mid \text{the query } [g(\cdot) = pk] \in \mathsf{trans} \text{ is asked by Setup}\}. \]

  Intuitively, these are the $pk$'s whose corresponding $sk$'s are “useful” for
  decrypting ciphertexts.

- We let $Q_K(\mathsf{trans})$ denote the public keys queried by the KeyGen algorithm
  when some personal secret key is derived:

  \[ Q_K(\mathsf{trans}) := \{pk \mid [g(\cdot) = pk] \in \mathsf{trans} \text{ is asked by } \mathsf{KeyGen}_{MSK}(\cdot)\}, \]
  \[ Q_{K-S}(\mathsf{trans}) := Q_K(\mathsf{trans}) \setminus Q_S(\mathsf{trans}). \]

- Finally, we will also look at the public keys “discovered” during encryption
  and decryption (cf. step 3 of the experiments):

  \[ Q_{\mathsf{ENC+DEC}}(\mathsf{trans}, I, f) := \{pk \mid [g(\cdot) = pk] \text{ asked by } \mathsf{Dec}_{SK_f}(\mathsf{Enc}_{MPK}(I, \cdot\,; \cdot))\}. \]




A.1 Bounding Probabilities of Bad Events

Fixing the master secret key $MSK$ and the oracle $O$ (this fixes $MPK$ as well
as $\{SK_f\}_{f \in \mathcal{F}_n}$), we define four “bad” events and bound the probabilities of each
of them. Here, we will only describe and bound one of these events; we refer to
the full version of our paper for the remainder of the proof.

Let $E^i_{\mathsf{enc}}$ be the event that either of the following is true (in $\mathsf{Expt}_i$):

1. $\exists f_i \in \{f_1, \ldots, f_p\}$ such that $f_i(I^*) = 1$.

2. The following condition holds:

   \[
   Q_{\mathsf{ENC+DEC}}(\mathsf{trans}^i_{*}, I^*, f^*) \cap Q_S(\mathsf{trans}^i_{\mathsf{sim\text{-}setup}})
   \;\not\subseteq\; \bigcup_{f \in \{f_1, \ldots, f_p\}} Q_{\mathsf{ENC+DEC}}(\mathsf{trans}^i_{\mathsf{pks}}, I_f, f) \cap Q_S(\mathsf{trans}^i_{\mathsf{sim\text{-}setup}}),
   \]

   where $I_f := A_2(1^n, f)$.

Intuitively, the second condition above is the event that the public keys that are
“useful” for $f_1, \ldots, f_p$ do not contain the public keys that are “useful” for $f^*$.

We bound the probability of $E^3_{\mathsf{enc}}$ using the assumed easily-covered property
of $\{(\mathcal{F}_n, \mathbb{A}_n)\}$; this is the crux of our proof, and is what motivates Definition 5.

Proposition 1. $\Pr[E^3_{\mathsf{enc}}] \le 1/5$.

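Proof. Fix $O$ and $MSK \in \{0,1\}^n$, thus fixing $\mathsf{trans}^3_{\mathsf{sim\text{-}setup}}$. If for each $f \in \mathcal{F}_n$
we fix a random tape $r_f$ that is sufficiently long to run
$\mathsf{Dec}^{O}_{SK_f}(\mathsf{Enc}^{O}_{MPK}(I, b; r))$ (where $I := A_2(f)$), then this defines, for each $f$, the set

\[
S_f := \{pk \mid [g(\cdot) = pk] \text{ asked by } \mathsf{Dec}_{SK_f}(\mathsf{Enc}_{MPK}(I, b; r))\} \cap Q_S(\mathsf{trans}^3_{\mathsf{sim\text{-}setup}}).
\]

Numbering the (at most $q$) public keys in $Q_S(\mathsf{trans}^3_{\mathsf{sim\text{-}setup}})$ in lexicographic
order, we can view these as an $\mathcal{F}_n$-set system over $[q]$. The fact that
$\{(\mathcal{F}_n, \mathbb{A}_n)\}$ can be $q$-covered implies that there exists a polynomial $p$ such that

\[
\Pr\left[
\begin{array}{l}
\forall f \in \mathcal{F}_n : r_f \leftarrow \{0,1\}^{*} \\
f^* \leftarrow A_1(1^n);\; I^* := A_2(1^n, f^*) \\
f_1, \ldots, f_p \leftarrow A_3(1^n, f^*)
\end{array}
: \Big( S_{f^*} \subseteq \bigcup_{i=1}^{p} S_{f_i} \Big) \wedge \big( \forall i : f_i(I^*) = 0 \big)
\right] \ge \frac{4}{5}.
\]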
The above is a lower bound on the probability that $E^3_{\mathsf{enc}}$ does not occur. □


Abstract. This paper presents a hierarchical predicate encryption
(HPE) scheme for inner-product predicates that is secure (selectively
attribute-hiding) in the standard model under new assumptions. These
assumptions are non-interactive and of fixed size in the number of the
adversary's queries (i.e., not “q-type”), and are proven to hold in the generic
model. To the best of our knowledge, this is the first HPE (or delegatable
PE) scheme for inner-product predicates that is secure in the
standard model. The underlying techniques of our result are based on a
new approach on bilinear pairings, which is extended from bilinear pairing
groups over linear spaces. They are quite different from the existing
techniques and may be of independent interest.


1 Introduction 

1.1 Background 

The notion of predicate encryption (PE) was explicitly presented by Katz, Sahai
and Waters [?] as a generalized (fine-grained) notion of encryption that covers
identity-based encryption (IBE) [2I3I5I9I1 11115) , hidden- vector encryption (HVE) 
0 and attribute-based encryption (ABE) fill 311 9121)121 . 

Informally, secret keys in a predicate encryption scheme correspond to predicates
in some class $\mathcal{F}$, and a sender associates a ciphertext with an attribute in
a set $\Sigma$; a ciphertext associated with the attribute $I \in \Sigma$ can be decrypted by
secret key $\mathsf{sk}_f$ corresponding to the predicate $f \in \mathcal{F}$ if and only if $f(I) = 1$.

In addition, attribute-hiding, a security notion for PE that is stronger than the
basic security requirement, payload-hiding, was defined in [Ej. Roughly speaking,
attribute-hiding requires that a ciphertext conceal the associated attribute as
well as the plaintext, while payload-hiding only requires that a ciphertext conceal
the plaintext. If attributes are identities, i.e., PE is IBE, attribute-hiding
PE implies anonymous IBE.

Katz, Sahai and Waters [E! also presented a concrete construction of PE for
a class of predicates called inner-product predicates, which represents a wide
class of predicates that includes an equality test (for IBE and HVE), disjunctions
or conjunctions of equality tests, and, more generally, arbitrary CNF or
DNF formulas (for ABE). Informally, an attribute of inner-product predicates
is expressed as vector $\vec{x}$ and predicate $f_{\vec{v}}$ is associated with vector $\vec{v}$, where
$f_{\vec{v}}(\vec{x}) = 1$ iff $\vec{x} \cdot \vec{v} = 0$. (Here, $\vec{x} \cdot \vec{v}$ denotes the standard inner-product.)

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 214 |2.3 I J 2009.

Although the Katz-Sahai- Waters scheme m is the most expressive attribute- 
hiding PE among the existing schemes, no delegation functionality was proposed. 
Shi and Waters m presented a delegation mechanism for a class of PE, but the 
admissible predicates of the system, which is a class of equality tests for HVE,
are more limited than inner-product predicates in m- Okamoto and Takashima 
m presented hierarchical delegation of PE for inner-product predicates, but the 
security proof was only given in the generic model. 

1.2 Our Results 

This paper addresses the above problems in [1112211 8'lj . 

— This paper proposes a hierarchical predicate encryption (HPE) scheme for 
inner-product predicates, where a (natural) hierarchical delegation system 
of inner-product predicates is provided e.g., our hierarchical system is con- 
sistent with that for hierarchical IBE (HIBE) [41811 H12j (i.e., our HPE is 
specialized to anonymous HIBE, if the predicate of HPE is specified to the 
equality test of identities). 

— The proposed HPE scheme is selectively attribute-hiding against chosen- 
plaintext-attacks (CPA) in the standard model under two new assumptions, 
the RDSP and IDSP assumptions. These assumptions are non-interactive,
falsifiable and of fixed size in the number of the adversary’s queries (i.e., not
“q-type”), and are proven to hold in the generic model.

— To achieve the result, this paper advances an approach recently developed in 
[1 711 8j . This approach is extended from bilinear pairing groups into higher 
dimensional vector spaces, and a notion, dual pairing vector spaces (DPVS), 
is employed in this paper. (We will explain this approach below.) 

One of the most basic decisional assumptions in this approach is the de- 
cisional subspace problem (DSP) assumption. (It is a higher-dimensional 
generalization of the decisional DH and Linear assumptions, and the rela- 
tionships of this assumption with the traditional ones are studied in m 
The assumptions introduced in this paper, the RDSP and IDSP assump- 
tions, are variants of the DSP assumption in DPVS. 

— The performance of the proposed HPE scheme is almost the same as (or 
slightly worse than) that in [E| , where the dimension of DPVS for our HPE 
scheme is n + 3, whereas that for [IB] is n + 2, when n is the dimension of 
predicate/attribute vectors. 

— Since HPE is a generalized (fine-grained) version of anonymous HIBE 
(AHIBE) (or includes AHIBE as a special case), HPE covers (a generalized 
version of) applications described in jSj, fully private communication and 
search on encrypted data. For example, we can use a two-level HPE scheme 
where the first level corresponds to the predicate/attribute of (single-layer) 
PE and the second level corresponds to those of “attribute search by a
predicate” (generalized “keyword search”).

1.3 A New Approach: Dual Pairing Vector Spaces

We now explain how the approach works by using a typical construction example
on direct products of pairing groups $(q, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, g_1, g_2, g_T, e)$, where $q$ is a
prime, $\mathbb{G}_1$, $\mathbb{G}_2$ and $\mathbb{G}_T$ are cyclic groups of order $q$, $g_i$ is a generator of $\mathbb{G}_i$
($i = 1, 2$), $e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$ is a non-degenerate bilinear pairing operation,
and $g_T := e(g_1, g_2) \neq 1$. Here we denote the group operation of $\mathbb{G}_1$, $\mathbb{G}_2$ and $\mathbb{G}_T$
by multiplication. Note that this construction also works on symmetric pairing
groups, where $\mathbb{G}_1 = \mathbb{G}_2$. As for the definitions of some notations, see Section 1.5.
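Vector spaces $\mathbb{V}$ and $\mathbb{V}^*$: $\mathbb{V} := \overbrace{\mathbb{G}_1 \times \cdots \times \mathbb{G}_1}^{N}$ and $\mathbb{V}^* := \overbrace{\mathbb{G}_2 \times \cdots \times \mathbb{G}_2}^{N}$,
whose elements are expressed by $N$-dimensional vectors, $\boldsymbol{x} := (g_1^{x_1}, \ldots, g_1^{x_N})$
and $\boldsymbol{y} := (g_2^{y_1}, \ldots, g_2^{y_N})$, respectively ($x_i, y_i \in \mathbb{F}_q$ for $i = 1, \ldots, N$).

Canonical bases $\mathbb{A}$ and $\mathbb{A}^*$: $\mathbb{A} := (\boldsymbol{a}_1, \ldots, \boldsymbol{a}_N)$ of $\mathbb{V}$, where $\boldsymbol{a}_1 := (g_1, 1, \ldots, 1)$,
$\boldsymbol{a}_2 := (1, g_1, 1, \ldots, 1)$, $\ldots$, $\boldsymbol{a}_N := (1, \ldots, 1, g_1)$. $\mathbb{A}^* := (\boldsymbol{a}^*_1, \ldots, \boldsymbol{a}^*_N)$ of $\mathbb{V}^*$,
where $\boldsymbol{a}^*_1 := (g_2, 1, \ldots, 1)$, $\boldsymbol{a}^*_2 := (1, g_2, 1, \ldots, 1)$, $\ldots$, $\boldsymbol{a}^*_N := (1, \ldots, 1, g_2)$.

Pairing operation: $e(\boldsymbol{x}, \boldsymbol{y}) := \prod_{i=1}^{N} e(g_1^{x_i}, g_2^{y_i}) = e(g_1, g_2)^{\sum_{i=1}^{N} x_i y_i} = g_T^{\vec{x} \cdot \vec{y}} \in \mathbb{G}_T$
for the above $\boldsymbol{x} \in \mathbb{V}$ and $\boldsymbol{y} \in \mathbb{V}^*$.

Base change: Canonical basis $\mathbb{A}$ is changed to basis $\mathbb{B} := (\boldsymbol{b}_1, \ldots, \boldsymbol{b}_N)$ of $\mathbb{V}$ using
a uniformly chosen (regular) linear transformation, $X := (\chi_{i,j}) \xleftarrow{\mathsf{U}} GL(N, \mathbb{F}_q)$,
such that $\boldsymbol{b}_i = \sum_{j=1}^{N} \chi_{i,j} \boldsymbol{a}_j$ ($i = 1, \ldots, N$). $\mathbb{A}^*$ is also changed to basis
$\mathbb{B}^* := (\boldsymbol{b}^*_1, \ldots, \boldsymbol{b}^*_N)$ of $\mathbb{V}^*$, such that $(\vartheta_{i,j}) := (X^T)^{-1}$, $\boldsymbol{b}^*_i = \sum_{j=1}^{N} \vartheta_{i,j} \boldsymbol{a}^*_j$
($i = 1, \ldots, N$). We see that $e(\boldsymbol{b}_i, \boldsymbol{b}^*_j) = g_T^{\delta_{i,j}}$ ($\delta_{i,j} = 1$ if $i = j$, and $\delta_{i,j} = 0$ if
$i \neq j$), i.e., $\mathbb{B}$ and $\mathbb{B}^*$ are dual orthonormal bases of $\mathbb{V}$ and $\mathbb{V}^*$.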

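Intractable Problem: One of the most natural decisional problems in our
approach is the decisional subspace problem (DSP) HZj. The $\mathrm{DSP}_{(N_1, N_2)}$
assumption is: it is hard to tell $\boldsymbol{v} := v_{N_2+1} \boldsymbol{b}_{N_2+1} + \cdots + v_{N_1} \boldsymbol{b}_{N_1}$ from
$\boldsymbol{u} := v_1 \boldsymbol{b}_1 + \cdots + v_{N_1} \boldsymbol{b}_{N_1}$, where $(v_1, \ldots, v_{N_1}) \xleftarrow{\mathsf{U}} \mathbb{F}_q^{N_1}$ and $N_2 + 1 < N_1$. DSP is
intractable if the generalized DDH or DLIN problem is intractable E,

Trapdoor: Although the DSP problem is assumed to be intractable, it can
be efficiently solved by using trapdoor $\boldsymbol{t}^* \in \mathrm{span}\langle \boldsymbol{b}^*_1, \ldots, \boldsymbol{b}^*_{N_2} \rangle$. Given
$\boldsymbol{v} := v_{N_2+1} \boldsymbol{b}_{N_2+1} + \cdots + v_{N_1} \boldsymbol{b}_{N_1}$ or $\boldsymbol{u} := v_1 \boldsymbol{b}_1 + \cdots + v_{N_1} \boldsymbol{b}_{N_1}$, we can tell $\boldsymbol{v}$ from
$\boldsymbol{u}$ using $\boldsymbol{t}^*$ since $e(\boldsymbol{v}, \boldsymbol{t}^*) = 1$ and $e(\boldsymbol{u}, \boldsymbol{t}^*) \neq 1$ with high probability.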
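The base change and the trapdoor can be checked numerically if one works "in the exponent". The following is only a toy sketch under stated assumptions: every vector is represented by its exponent vector over $\mathbb{F}_q$ (so discrete logarithms are known and nothing here is secure), the prime `Q` is an arbitrary small stand-in for the group order, and the helper names (`mat_inv_mod`, `dual_bases`, `pair_exp`, `lin_comb`) are hypothetical, not taken from the paper.

```python
import random

Q = 2**31 - 1  # toy prime standing in for the group order q (far too small to be secure)

def mat_inv_mod(M, p):
    """Invert a square matrix over F_p by Gauss-Jordan elimination."""
    n = len(M)
    # Augment M with the identity matrix.
    A = [[v % p for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        piv = next((r for r in range(col, n) if A[r][col] != 0), None)
        if piv is None:
            raise ValueError("matrix is singular over F_p")
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], -1, p)  # modular inverse (Python 3.8+)
        A[col] = [v * inv % p for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [(v - f * w) % p for v, w in zip(A[r], A[col])]
    return [row[n:] for row in A]

def dual_bases(N, rng):
    """Sample X from GL(N, F_Q); row i of X is the exponent vector of b_i,
    row i of (X^T)^{-1} is the exponent vector of b*_i."""
    while True:
        X = [[rng.randrange(Q) for _ in range(N)] for _ in range(N)]
        try:
            X_inv = mat_inv_mod(X, Q)
        except ValueError:
            continue  # singular draw, resample
        theta = [[X_inv[j][i] for j in range(N)] for i in range(N)]  # (X^T)^{-1}
        return X, theta

def pair_exp(x, y):
    """Exponent of g_T in e(x, y): the inner product of the exponent vectors."""
    return sum(a * b for a, b in zip(x, y)) % Q

def lin_comb(coeffs, basis):
    """Exponent vector of sum_i coeffs[i] * basis[i]."""
    return [sum(c * row[k] for c, row in zip(coeffs, basis)) % Q
            for k in range(len(basis[0]))]
```

With $N_1 = 5$ and $N_2 = 2$, a trapdoor built from $\boldsymbol{b}^*_1, \boldsymbol{b}^*_2$ pairs to exponent 0 with any $\boldsymbol{v}$ that has no $\boldsymbol{b}_1, \boldsymbol{b}_2$ component, and to a (generally nonzero) exponent with $\boldsymbol{u}$.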


1.4 Related Works on Our Approach 

Higher dimensional vector treatment of bilinear pairing groups has already been
employed in the literature, especially in the areas of IBE, ABE and BE (e.g.,
For example, in a typical vector treatment, two vector forms
of $P := (g_1^{x_1}, \ldots, g_1^{x_n})$ and $Q := (g_2^{y_1}, \ldots, g_2^{y_n})$ are set and pairing for $P$ and $Q$
is operated as $e(P, Q) := \prod_{i=1}^{n} e(g_1^{x_i}, g_2^{y_i})$. Such a treatment can be rephrased in
our approach using the (symmetric pairing) notations shown in Section 1.3 such
that $P = x_1 \boldsymbol{a}_1 + \cdots + x_n \boldsymbol{a}_n$ and $Q = y_1 \boldsymbol{a}^*_1 + \cdots + y_n \boldsymbol{a}^*_n$ over canonical basis $\mathbb{A}$
and $\mathbb{A}^*$.

The major drawback of this approach is the easily decomposable property over
$\mathbb{A}$ (and $\mathbb{A}^*$). That is, it is easy to decompose $x_i \boldsymbol{a}_i = (1, \ldots, 1, g_1^{x_i}, 1, \ldots, 1)$ from
$P := x_1 \boldsymbol{a}_1 + \cdots + x_n \boldsymbol{a}_n = (g_1^{x_1}, \ldots, g_1^{x_n})$.

In contrast, the current approach employs basis $\mathbb{B}$ that is linearly transformed
from $\mathbb{A}$ using a secret random matrix $X \in \mathbb{F}_q^{n \times n}$. A remarkable property over $\mathbb{B}$
is that it seems hard to decompose $x_i \boldsymbol{b}_i$ from $P' := x_1 \boldsymbol{b}_1 + \cdots + x_n \boldsymbol{b}_n$. In addition,
the dual orthonormal basis $\mathbb{B}^*$ of $\mathbb{V}^*$ can be used as a source of the trapdoors to
the decomposability (see Section 1.3) through the pairing operation over $\mathbb{B}$ and
$\mathbb{B}^*$. The hard decomposability and its trapdoors are the key trick in this paper.
Note that composite order pairing groups are often employed with similar tricks,
hard decomposability of a composite order group into the prime order subgroups
and its trapdoors through factoring (e.g., jlBI22j).

1.5 Notations 

When $A$ is a random variable or distribution, $y \xleftarrow{\mathsf{R}} A$ denotes that $y$ is randomly
selected from $A$ according to its distribution. When $A$ is a set, $y \xleftarrow{\mathsf{U}} A$ denotes
that $y$ is uniformly selected from $A$. $y := z$ denotes that $y$ is set, defined or
substituted by $z$. When $a$ is a fixed value, $A(x) \to a$ (e.g., $A(x) \to 1$) denotes
the event that machine (algorithm) $A$ outputs $a$ on input $x$. A function $f : \mathbb{N} \to \mathbb{R}$
is negligible in $\lambda$, if for every constant $c > 0$, there exists an integer $n$ such that
$f(\lambda) < \lambda^{-c}$ for all $\lambda > n$.

We denote the finite field of order $q$ by $\mathbb{F}_q$. A vector symbol denotes a vector
representation over $\mathbb{F}_q$, e.g., $\vec{x}$ denotes $(x_1, \ldots, x_n) \in \mathbb{F}_q^n$. $\vec{x} \cdot \vec{v}$ denotes the
inner-product $\sum_{i=1}^{n} x_i v_i$ of two vectors $\vec{x} = (x_1, \ldots, x_n)$ and $\vec{v} = (v_1, \ldots, v_n)$.
$X^T$ denotes the transpose of matrix $X$. A bold face letter denotes an element
of vector space $\mathbb{V}$ (resp. $\mathbb{V}^*$), e.g., $\boldsymbol{x} \in \mathbb{V}$ (resp. $\boldsymbol{x}^* \in \mathbb{V}^*$). $\mathrm{span}\langle \boldsymbol{b}_1, \ldots, \boldsymbol{b}_n \rangle$
(resp. $\mathrm{span}\langle \vec{x}_1, \ldots, \vec{x}_n \rangle$) denotes the subspace generated by $\boldsymbol{b}_1, \ldots, \boldsymbol{b}_n$ (resp.
$\vec{x}_1, \ldots, \vec{x}_n$).

2 Dual Pairing Vector Spaces 

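Definition 1. “Dual pairing vector spaces (DPVS)” $(q, \mathbb{V}, \mathbb{V}^*, \mathbb{G}_T, \mathbb{A}, \mathbb{A}^*)$ are a
tuple of a prime $q$, two $N$-dimensional vector spaces $\mathbb{V}$ and $\mathbb{V}^*$ over $\mathbb{F}_q$, a cyclic
group $\mathbb{G}_T$ of order $q$, and their canonical bases, i.e., $\mathbb{A} := (\boldsymbol{a}_1, \ldots, \boldsymbol{a}_N)$ of $\mathbb{V}$ and
$\mathbb{A}^* := (\boldsymbol{a}^*_1, \ldots, \boldsymbol{a}^*_N)$ of $\mathbb{V}^*$, that satisfy the following conditions:

1. [Non-degenerate bilinear pairing] There exists a polynomial-time computable
nondegenerate bilinear pairing $e : \mathbb{V} \times \mathbb{V}^* \to \mathbb{G}_T$, i.e., $e(s\boldsymbol{x}, t\boldsymbol{y}) = e(\boldsymbol{x}, \boldsymbol{y})^{st}$
and if $e(\boldsymbol{x}, \boldsymbol{y}) = 1$ for all $\boldsymbol{y} \in \mathbb{V}^*$, then $\boldsymbol{x} = \boldsymbol{0}$.

2. [Dual orthonormal bases] $\mathbb{A}$, $\mathbb{A}^*$, and $e$ satisfy $e(\boldsymbol{a}_i, \boldsymbol{a}^*_j) = g_T^{\delta_{i,j}}$ for all $i$ and
$j$, where $\delta_{i,j} = 1$ if $i = j$, and 0 otherwise, and $g_T \neq 1 \in \mathbb{G}_T$.

3. [Distortion maps] Endomorphisms $\phi_{i,j}$ of $\mathbb{V}$ s.t. $\phi_{i,j}(\boldsymbol{a}_j) = \boldsymbol{a}_i$ and $\phi_{i,j}(\boldsymbol{a}_k) = \boldsymbol{0}$
if $k \neq j$ are polynomial-time computable. Moreover, endomorphisms $\phi^*_{i,j}$
of $\mathbb{V}^*$ s.t. $\phi^*_{i,j}(\boldsymbol{a}^*_j) = \boldsymbol{a}^*_i$ and $\phi^*_{i,j}(\boldsymbol{a}^*_k) = \boldsymbol{0}$ if $k \neq j$ are also polynomial-time
computable. We call $\phi_{i,j}$ and $\phi^*_{i,j}$ “distortion maps”.

Three typical constructions are given in H3; a product of bilinear pairing groups,
or a Jacobian variety of a supersingular curve of genus $\geq 1$ [23]. See Section 1.3
as well (where the description of distortion maps is omitted).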

3 Assumptions 

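This section defines two variants of the DSP assumption, the RDSP and IDSP
assumptions. The intuition behind these assumptions is given in the Remark below.

DPVS generation algorithm $\mathcal{G}_{\mathsf{dpvs}}$ takes input $1^\lambda$ ($\lambda \in \mathbb{N}$) and $N \in \mathbb{N}$, and
outputs a description of $\mathsf{param} := (q, \mathbb{V}, \mathbb{V}^*, \mathbb{G}_T, \mathbb{A}, \mathbb{A}^*)$ with security parameter
$\lambda$ and $N$-dimensional $\mathbb{V}$ and $\mathbb{V}^*$. It can be constructed in a manner shown in ra-
We describe a random orthonormal basis generator $\mathcal{G}_{\mathsf{ob}}$ below, which is used as
a subroutine in the RDSP and IDSP instance generators.

$\mathcal{G}_{\mathsf{ob}}(1^\lambda, N)$ : $\mathsf{param} := (q, \mathbb{V}, \mathbb{V}^*, \mathbb{G}_T, \mathbb{A}, \mathbb{A}^*) \xleftarrow{\mathsf{R}} \mathcal{G}_{\mathsf{dpvs}}(1^\lambda, N)$,

$X := (\chi_{i,j}) \xleftarrow{\mathsf{U}} GL(N, \mathbb{F}_q)$, $(\vartheta_{i,j}) := (X^T)^{-1}$,

$\boldsymbol{b}_i := \sum_{j=1}^{N} \chi_{i,j} \boldsymbol{a}_j$, $\mathbb{B} := (\boldsymbol{b}_1, \ldots, \boldsymbol{b}_N)$, $\boldsymbol{b}^*_i := \sum_{j=1}^{N} \vartheta_{i,j} \boldsymbol{a}^*_j$, $\mathbb{B}^* := (\boldsymbol{b}^*_1, \ldots, \boldsymbol{b}^*_N)$,

return $(\mathsf{param}, \mathbb{B}, \mathbb{B}^*)$.

We now define the RDSP and IDSP instance generators, $\mathcal{G}^{\mathsf{RDSP}}_{\beta}$ and $\mathcal{G}^{\mathsf{IDSP}}_{\beta}$.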
^ DSP (1 A , n) : (param, B, B*) 4 g ob ( 1 A , n + 3), V := (Vl, ■ • • , Vn) - F # " \ {0 }, 
Si, ^2; Cl) C2 *— d n+ i := b n+ i + b n+ 2, B := (61, d n+ i, & n+3 ), 

( W W, 7 f ) I 7P) fc =i,2,3^GA(F ? ,3), 

For * = 1, . . . ,n; Ar = 1,2,3; 

:= ^b* + ^ yi b* n+1 + ^y iK+ z, rf > := (# + 
eo := <5l(E"=l 2/i b i) + $2^+3, 

e l : = y + Clbn+I + C2&n+2 + <^2^+3, 

return (param, B, {h[®* , ^=1,2,3, V, e/j). 

^ DSP (l\n) : (param,B,B*)^£ ob (l A ,n + 3), 

V := (Ul ^F g "\{'0}, it := U n ) +H f* \ {!?}, 

^i) ^2> Cl) C2 ^ F 9 , d n+1 := b n+ i + b n+ 2, B := (61, . . . , 6„, d n+ i, 6 n+ 3), 

For i = 1,. . . ,n; (w^ff^SVw ^ GL( F„ 3), 

For i = 1, . . . ,n; Ar = 1,2,3; 

h< fc > := c v^b* + 7 g>6J +1 + 7 g ) 6; +2 , := 7 g> + 7$, 

eo := Jl(E"=l Vibi) + (lbn+t + C2&ra+2 + ^2^71+3) 
e l := *{£2* u ibi) + Cl^n+1 + C2 b n +2 + ^2&n+3) 
return (param, B, t/ fc ^}i=i,..., T i;fc=i,2 > 3, ~V , &p)-

Definition 2 (RDSP: Decisional Subspace Problem with Relevant
Dual Vector Tuples). For all security parameters $\lambda \in \mathbb{N}$, we define the RDSP
advantage of a probabilistic machine $\mathcal{B}$ as follows:

$\mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}}(\lambda) := \left| \Pr\left[ \mathcal{B}(1^\lambda, \varrho) \to 1 \mid \varrho \xleftarrow{\mathsf{R}} \mathcal{G}^{\mathsf{RDSP}}_{0}(1^\lambda, n) \right] - \Pr\left[ \mathcal{B}(1^\lambda, \varrho) \to 1 \mid \varrho \xleftarrow{\mathsf{R}} \mathcal{G}^{\mathsf{RDSP}}_{1}(1^\lambda, n) \right] \right|$.

The RDSP assumption is: for any probabilistic polynomial-time adversary $\mathcal{B}$,
$\mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}}(\lambda)$ is negligible in $\lambda$.

Definition 3 (IDSP: Decisional Subspace Problem with Irrelevant
Dual Vector Tuples). The IDSP advantage of $\mathcal{B}$, $\mathsf{Adv}^{\mathsf{IDSP}}_{\mathcal{B}}(\lambda)$, and the IDSP
assumption are defined similarly as in Definition 2.

In the generic DPVS model, basic operations in $\mathbb{V}$, $\mathbb{V}^*$, and $\mathbb{G}_T$, i.e., vector
additions in $\mathbb{V}$ and $\mathbb{V}^*$, multiplication in $\mathbb{G}_T$, pairing, and distortion maps w.r.t. $\mathbb{A}$
or $\mathbb{A}^*$, are given by “generic” algorithms that act independently of the
representations of vectors or group elements.

Theorem 1. The advantages $\mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}}(\lambda)$ and $\mathsf{Adv}^{\mathsf{IDSP}}_{\mathcal{B}}(\lambda)$ are $O(d/2^\lambda)$ for any
adversary $\mathcal{B}$ in the generic DPVS model, where $d$ is the maximum of the degrees
of polynomials of formal variables (in the generic model game).

We will describe the proof of Theorem 1 in the full version of this paper.

Remark (Intuition behind the Assumptions) 

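Here we informally explain the RDSP assumption by using a simplified one.
In the simplified RDSP assumption, $(\boldsymbol{h}^*_1, \ldots, \boldsymbol{h}^*_n)$ is given to $\mathcal{A}$ in addition
to $(\mathbb{B} := (\boldsymbol{b}_1, \ldots, \boldsymbol{b}_{n+2}), \vec{y} := (y_1, \ldots, y_n), \boldsymbol{e}_\beta)$, such that $\boldsymbol{h}^*_i := u \boldsymbol{b}^*_i + y_i \boldsymbol{b}^*_{n+1}$
($i = 1, \ldots, n$; $u \xleftarrow{\mathsf{U}} \mathbb{F}_q$) and $\boldsymbol{e}_\beta := \delta_1 (\sum_{i=1}^{n} y_i \boldsymbol{b}_i) + \beta \zeta \boldsymbol{b}_{n+1} + \delta_2 \boldsymbol{b}_{n+2}$ ($\beta \xleftarrow{\mathsf{U}} \{0, 1\}$,
$\delta_1, \delta_2, \zeta \xleftarrow{\mathsf{U}} \mathbb{F}_q$). The simplified RDSP assumption is that it is hard for any
adversary $\mathcal{A}$, given $(\mathbb{B}, \vec{y}, \boldsymbol{e}_\beta)$ along with $(\boldsymbol{h}^*_1, \ldots, \boldsymbol{h}^*_n)$, to correctly guess $\beta$. (In the
DSP assumption, only $(\mathbb{B}, \vec{y}, \boldsymbol{e}_\beta)$ is given to $\mathcal{A}$.)

$(\boldsymbol{h}^*_1, \ldots, \boldsymbol{h}^*_n)$ is added in the RDSP assumption in order to simulate the key
generation oracle in the security proof of our encryption scheme as follows: for
any $\vec{v} := (v_1, \ldots, v_n)$ with $\vec{v} \cdot \vec{y} \neq 0$, the simulator can compute a secret
key $\boldsymbol{k}^*$ for $\vec{v}$ such that $\boldsymbol{k}^* := \frac{1}{\vec{v} \cdot \vec{y}} \sum_{i=1}^{n} v_i \boldsymbol{h}^*_i = \frac{u}{\vec{v} \cdot \vec{y}} (\sum_{i=1}^{n} v_i \boldsymbol{b}^*_i) + \boldsymbol{b}^*_{n+1} =
\sigma' (\sum_{i=1}^{n} v_i \boldsymbol{b}^*_i) + \boldsymbol{b}^*_{n+1}$ where $\sigma' := \frac{u}{\vec{v} \cdot \vec{y}}$.

This secret key generation procedure, however, does not work for $\vec{v}$ with
$\vec{v} \cdot \vec{y} = 0$, since $\frac{1}{\vec{v} \cdot \vec{y}}$ cannot be computed. Therefore, $(\boldsymbol{h}^*_1, \ldots, \boldsymbol{h}^*_n)$ does not
seem helpful to break the RDSP assumption, since a secret key $\boldsymbol{k}^*$ for $\vec{v}$ with
“$\vec{v} \cdot \vec{y} = 0$” is of use to guess $\beta$ by checking whether $e(\boldsymbol{e}_\beta, \boldsymbol{k}^*) = 1$ or not.
Hence, the RDSP assumption seems to hold if the DSP assumption does.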
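The key simulation in the simplified RDSP setting can be traced with plain modular arithmetic. This is a minimal sketch under assumptions that are not in the paper: vectors in $\mathbb{V}$ and $\mathbb{V}^*$ are represented by their coefficient lists over $\mathbb{B}$ and $\mathbb{B}^*$ (so, by dual orthonormality, the pairing exponent is just the inner product of coefficients), `Q` is an arbitrary toy prime, and the function names are hypothetical.

```python
Q = 2**31 - 1  # toy prime standing in for q (assumption; not secure)

def inner(a, b):
    """Pairing exponent of two coefficient vectors over dual bases B, B*."""
    return sum(x * y for x, y in zip(a, b)) % Q

def make_h(u, y):
    """h*_i := u b*_i + y_i b*_{n+1}, as coefficient lists of length n+2 over B*."""
    n = len(y)
    hs = []
    for i, yi in enumerate(y):
        h = [0] * (n + 2)
        h[i] = u % Q
        h[n] = yi % Q
        hs.append(h)
    return hs

def simulate_key(hs, v, y):
    """k* := (1 / (v.y)) * sum_i v_i h*_i; impossible when v.y = 0."""
    vy = inner(v, y)
    if vy == 0:
        raise ValueError("v . y = 0: the simulator cannot form this key")
    inv = pow(vy, -1, Q)
    return [inv * sum(vi * h[k] for vi, h in zip(v, hs)) % Q
            for k in range(len(hs[0]))]

def e_beta(y, delta1, delta2, zeta, beta):
    """e_beta := delta1 (sum_i y_i b_i) + beta*zeta b_{n+1} + delta2 b_{n+2} over B."""
    return [delta1 * yi % Q for yi in y] + [beta * zeta % Q, delta2 % Q]
```

For a simulated key the coefficient of $\boldsymbol{b}^*_{n+1}$ is always 1, as in the formula above. A key for an orthogonal $\vec{v}$, which the simulator cannot produce, would pair with $\boldsymbol{e}_\beta$ to exponent $\beta\zeta$ and so reveal $\beta$.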

Similarly the IDSP assumption is introduced as a variant of the DSP assumption.
In the RDSP and IDSP assumptions employed in this paper, we use a
public element $\boldsymbol{d}_{n+1} := \boldsymbol{b}_{n+1} + \boldsymbol{b}_{n+2}$ (in place of $\boldsymbol{b}_{n+1}$ in basis $\mathbb{B}$ in the simplified
one), and $\boldsymbol{b}_{n+1}$ and $\boldsymbol{b}_{n+2}$ are not published. Such a modification is required for
the IDSP assumption since the simplified IDSP assumption does not hold.

In addition, in our RDSP (and IDSP) assumption, $\{\boldsymbol{h}^{(k)*}_i\}_{i=1,\ldots,n;\,k=1,2,3}$
is employed in place of $\{\boldsymbol{h}^*_i\}_{i=1,\ldots,n}$. This modification is introduced to
re-randomize the coefficients for each key generation of the simulation by a random
linear combination of $\boldsymbol{h}^{(1)*}_i$, $\boldsymbol{h}^{(2)*}_i$ and $\boldsymbol{h}^{(3)*}_i$.

4 Definition of Hierarchical Predicate Encryption (HPE) 

This section defines hierarchical predicate encryption (HPE) for the class of
hierarchical inner-product predicates and its security.$^1$

In a delegation system, it is required that a user who has a capability can delegate
to another user a more restrictive capability. In addition to this requirement,
our hierarchical inner-product encryption introduces a format of hierarchy $\vec{\mu}$ to
define common delegation structure in a system.

We call a tuple of positive integers $\vec{\mu} := (n, d; \mu_1, \ldots, \mu_d)$ s.t. $\mu_0 = 0 < \mu_1 <
\mu_2 < \cdots < \mu_d = n$ a format of hierarchy of depth $d$ attribute spaces. Let $\Sigma_\ell$
($\ell = 1, \ldots, d$) be the sets of attributes, where each $\Sigma_\ell := \mathbb{F}_q^{\mu_\ell - \mu_{\ell-1}} \setminus \{\vec{0}\}$.

Let the hierarchical attributes $\Sigma := \bigcup_{\ell=1}^{d} (\Sigma_1 \times \cdots \times \Sigma_\ell)$, where the union is
a disjoint union. Then, for $\vec{v}_i \in \mathbb{F}_q^{\mu_i - \mu_{i-1}} \setminus \{\vec{0}\}$, the hierarchical predicate
$f_{(\vec{v}_1, \ldots, \vec{v}_\ell)}$ on hierarchical attributes $(\vec{x}_1, \ldots, \vec{x}_h) \in \Sigma$ is defined as follows:
$f_{(\vec{v}_1, \ldots, \vec{v}_\ell)}(\vec{x}_1, \ldots, \vec{x}_h) = 1$ iff $\ell \leq h$ and $\vec{x}_i \cdot \vec{v}_i = 0$ for all $i$ s.t. $1 \leq i \leq \ell$.

Let the space of hierarchical predicates $\mathcal{F} := \{ f_{(\vec{v}_1, \ldots, \vec{v}_\ell)} \mid \vec{v}_i \in \mathbb{F}_q^{\mu_i - \mu_{i-1}} \setminus
\{\vec{0}\} \}$. We call $h$ (resp. $\ell$) the level of $(\vec{x}_1, \ldots, \vec{x}_h)$ (resp. $(\vec{v}_1, \ldots, \vec{v}_\ell)$).
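The hierarchical predicate is easy to state in code. The sketch below only illustrates the definition above; the function name `hier_predicate` and the representation of each level as a tuple of integers modulo a prime `q` are assumptions made for this example.

```python
def hier_predicate(vs, xs, q):
    """Evaluate f_(v_1..v_l)(x_1..x_h) over F_q.

    vs: list of l predicate vectors, xs: list of h attribute vectors,
    where vs[i] and xs[i] have the same length (mu_{i+1} - mu_i).
    Returns 1 iff l <= h and x_i . v_i = 0 (mod q) for every level i <= l.
    """
    if len(vs) > len(xs):
        return 0  # the key is for a deeper level than the ciphertext
    for v, x in zip(vs, xs):  # zip stops after the first l levels
        if sum(a * b for a, b in zip(x, v)) % q != 0:
            return 0
    return 1
```

For instance, with `q = 101`, the level-1 predicate `[(1, 1)]` accepts the level-2 attribute `[(1, 100), (5, 6)]` because `1 + 100` is 0 modulo 101, whereas a level-2 predicate can never accept a level-1 attribute.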

Definition 4. Let $\vec{\mu} := (n, d; \mu_1, \ldots, \mu_d)$ s.t. $\mu_0 = 0 < \mu_1 < \mu_2 < \cdots < \mu_d = n$
be a format of hierarchy of depth $d$ attribute spaces. A hierarchical predicate
encryption (HPE) scheme for the class of hierarchical inner-product predicates
$\mathcal{F}$ over the set of hierarchical attributes $\Sigma$ consists of probabilistic polynomial-time
algorithms Setup, GenKey, Enc, Dec, and Delegate$_\ell$ for $\ell = 1, \ldots, d-1$. They
are given as follows:

— Setup takes as input security parameter $1^\lambda$ and format of hierarchy $\vec{\mu}$, and
outputs (master) public key pk and (master) secret key sk.

— GenKey takes as input the master public key pk, secret key sk, and predicate
vectors $(\vec{v}_1, \ldots, \vec{v}_\ell)$. It outputs a corresponding secret key $\mathsf{sk}_{(\vec{v}_1, \ldots, \vec{v}_\ell)}$.

— Enc takes as input the master public key pk, attribute vectors $(\vec{x}_1, \ldots, \vec{x}_h)$,
where $1 \leq h \leq d$, and plaintext $m$ in some associated plaintext space, msg.
It returns ciphertext $c$.

— Dec takes as input the master public key pk, secret key $\mathsf{sk}_{(\vec{v}_1, \ldots, \vec{v}_\ell)}$, where
$1 \leq \ell \leq d$, and ciphertext $c$. It outputs either plaintext $m$ or the distinguished
symbol $\perp$.

— Delegate$_\ell$ takes as input the master public key pk, $\ell$-th level secret key
$\mathsf{sk}_{(\vec{v}_1, \ldots, \vec{v}_\ell)}$, and $(\ell + 1)$-th level predicate vector $\vec{v}_{\ell+1}$. It returns $(\ell + 1)$-th
level secret key $\mathsf{sk}_{(\vec{v}_1, \ldots, \vec{v}_{\ell+1})}$.

An HPE scheme should have the following correctness property: for all correctly
generated pk and $\mathsf{sk}_{(\vec{v}_1, \ldots, \vec{v}_\ell)}$, generate $c \xleftarrow{\mathsf{R}} \mathsf{Enc}(\mathsf{pk}, m, (\vec{x}_1, \ldots, \vec{x}_h))$
and $m' := \mathsf{Dec}(\mathsf{pk}, \mathsf{sk}_{(\vec{v}_1, \ldots, \vec{v}_\ell)}, c)$. If $f_{(\vec{v}_1, \ldots, \vec{v}_\ell)}(\vec{x}_1, \ldots, \vec{x}_h) = 1$, then $m' = m$.
Otherwise, $m' \neq m$ except for negligible probability.

1 More general delegation structures (partial order structures) than tree hierarchical
structures can be easily realized in our HPE scheme. See Remark in Section 5.2.

For $f$ and $f'$ in $\mathcal{F}$, we denote $f' \leq f$ if the predicate vector for $f$ is a prefix
of that for $f'$. For the following definition for key queries, see m-
Remark: We will explain the hierarchical structure by using a small (toy)
example that has three levels and each level consists of 2-dimensional space,
i.e., 6-dimensional space is employed in total. That is, $\vec{\mu} := (n, d; \mu_1, \ldots, \mu_d)
= (6, 3; 2, 4, 6)$ in this example.

A user who possesses a secret key $\mathsf{sk}_1$ in the top level, associated with the
top level predicate vector $\vec{v}_1 := (v_1, v_2)$, can delegate any value (say $\vec{v}_2 :=
(v_3, v_4)$) of the second level key $\mathsf{sk}_2$ such that the predicate vector for $\mathsf{sk}_2$ is
$(\vec{v}_1, \vec{v}_2)$. Similarly, a user who possesses a secret key in the second level, $\mathsf{sk}_2$
with $(\vec{v}_1, \vec{v}_2)$, can delegate any value (say $\vec{v}_3 := (v_5, v_6)$) of the third level key
$\mathsf{sk}_3$ with $(\vec{v}_1, \vec{v}_2, \vec{v}_3)$.

Secret key $\mathsf{sk}_1$ with $\vec{v}_1$ can decrypt a ciphertext associated with attribute
vector $(\vec{x}_1, (*, *), (*, *)) := ((x_1, x_2), (*, *), (*, *))$ if $\vec{x}_1 \cdot \vec{v}_1 = 0$, where $*$ denotes
an arbitrary value. Secret key $\mathsf{sk}_2$ with $(\vec{v}_1, \vec{v}_2)$ can decrypt a ciphertext
with attribute vector $(\vec{x}_1, \vec{x}_2, (*, *))$ if $\vec{x}_1 \cdot \vec{v}_1 = 0$ and $\vec{x}_2 \cdot \vec{v}_2 = 0$. However,
$\mathsf{sk}_2$ cannot decrypt a ciphertext with higher level (top level) attribute vector
$\vec{x}_1 := (x_1, x_2)$ (or $(\vec{x}_1, (*, *), (*, *))$). Therefore, the capability of a delegated
key $\mathsf{sk}_2$ is more limited than the parent key $\mathsf{sk}_1$.

Hence, when $(\vec{v}_1, \vec{v}_2) := ((v_1, v_2), (v_3, v_4))$ is a predicate vector for a secret
key, $(\vec{v}_1, \vec{v}_2)$ is considered to be $(\vec{v}_1, \vec{v}_2, (0, 0))$, and when $\vec{x}_1 := (x_1, x_2)$ is
an attribute vector for a ciphertext, $\vec{x}_1$ is considered to be $(\vec{x}_1, (*, *), (*, *))$,
where $(*, *) \cdot (0, 0) = 0$ and $(*, *) \cdot \vec{v}_2 \neq 0$ unless $\vec{v}_2 = (0, 0)$.

Definition 5. A hierarchical inner-product predicate encryption scheme for
hierarchical predicates $\mathcal{F}$ over hierarchical attributes $\Sigma$ is selectively attribute-hiding
(AH) against chosen plaintext attacks if for all probabilistic polynomial-time
adversaries $\mathcal{A}$, the advantage of $\mathcal{A}$ in the following experiment is negligible in the
security parameter.

1. $\mathcal{A}$ outputs challenge attribute vectors $X^{(0)} := (\vec{x}^{(0)}_1, \ldots, \vec{x}^{(0)}_{h^{(0)}})$, $X^{(1)} :=
(\vec{x}^{(1)}_1, \ldots, \vec{x}^{(1)}_{h^{(1)}})$.

2. Setup is run to generate keys pk and sk, and pk is given to $\mathcal{A}$.

3. $\mathcal{A}$ may adaptively make a polynomial number of queries of the following
type:

— [Create key] $\mathcal{A}$ asks the challenger to create a secret key for a predicate
$f \in \mathcal{F}$. The challenger creates a key for $f$ without giving it to $\mathcal{A}$.


222 T. Okamoto and K. Takashima 


— [Create delegated key] $\mathcal{A}$ specifies a key for predicate $f$ that has already
been created, and asks the challenger to perform a delegation operation
to create a child key for $f' \leq f$. The challenger computes the child key
without giving it to the adversary.

— [Reveal key] $\mathcal{A}$ asks the challenger to reveal an already-created key for
predicate $f$ s.t. $f(X^{(0)}) = f(X^{(1)}) = 0$.

Note that when key creation requests are made, $\mathcal{A}$ does not automatically see
the created key. $\mathcal{A}$ sees a key only when it makes a reveal key query.

4. $\mathcal{A}$ outputs challenge plaintexts $m^{(0)}, m^{(1)}$.

5. A random bit $b$ is chosen. $\mathcal{A}$ is given $c \xleftarrow{\mathsf{R}} \mathsf{Enc}(\mathsf{pk}, m^{(b)}, X^{(b)})$.

6. The adversary may continue to request keys for additional predicate vectors
subject to the restrictions given in step 3.

7. $\mathcal{A}$ outputs a bit $b'$, and succeeds if $b' = b$.

We define the advantage of $\mathcal{A}$ as the quantity $\mathsf{Adv}^{\mathsf{HPE,AH}}_{\mathcal{A}}(\lambda) := |\Pr[b' = b] - 1/2|$.

Remark: In Definition 5, adversary $\mathcal{A}$ is not allowed to ask a key-query for
$(\vec{v}_1, \ldots, \vec{v}_\ell)$ such that $f_{(\vec{v}_1, \ldots, \vec{v}_\ell)}(X^{(b)}) = 1$ for some $b \in \{0, 1\}$, while in the
security definition in m , such a key-query is allowed provided that $m^{(0)} = m^{(1)}$
and $f_{(\vec{v}_1, \ldots, \vec{v}_\ell)}(X^{(0)}) = f_{(\vec{v}_1, \ldots, \vec{v}_\ell)}(X^{(1)}) = 1$. This restriction is introduced to
prove the security of the proposed HPE scheme only under the RDSP and IDSP
assumptions. If we introduce another variant of the assumptions, we can relax
this restriction. We will describe this case in the full version of this paper.

5 The Proposed HPE Scheme 

5.1 Key Idea in Constructing the Proposed HPE 

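We will explain a key idea of the proposed HPE scheme.

First, as a special (1-level) case of the proposed construction of HPE, we will
show a predicate encryption (PE) construction for the inner-product predicate.
Through the orthonormal property of (random) dual bases ($\mathbb{B} := (\boldsymbol{b}_1, \ldots, \boldsymbol{b}_{n+3})$,
$\mathbb{B}^* := (\boldsymbol{b}^*_1, \ldots, \boldsymbol{b}^*_{n+3})$) in DPVS, $(q, \mathbb{V}, \mathbb{V}^*, \mathbb{G}_T, \mathbb{A}, \mathbb{A}^*)$ (Sections 1.3, 2 and 3), the
PE scheme for the ($n$-dimensional) inner-product predicate can be constructed
as below, where $\mathbb{V}$ and $\mathbb{V}^*$ are $(n + 3)$-dimensional spaces, the public parameter
is $(\boldsymbol{b}_1, \ldots, \boldsymbol{b}_n, \boldsymbol{d}_{n+1} := \boldsymbol{b}_{n+1} + \boldsymbol{b}_{n+2}, \boldsymbol{b}_{n+3})$ as well as the parameters of DPVS,
and the master secret key is ($X$ and) $\mathbb{B}^*$. Ciphertext $(\boldsymbol{c}_1, c_2)$ for attribute $\vec{x} :=
(x_1, \ldots, x_n) \in \mathbb{F}_q^n$ and plaintext $m \in \mathbb{G}_T$ is $\boldsymbol{c}_1 := \delta_1 (x_1 \boldsymbol{b}_1 + \cdots + x_n \boldsymbol{b}_n) +
\zeta \boldsymbol{d}_{n+1} + \delta_2 \boldsymbol{b}_{n+3}$ and $c_2 := g_T^{\zeta} m$, where $\delta_1, \delta_2, \zeta \xleftarrow{\mathsf{U}} \mathbb{F}_q$. Secret key $\boldsymbol{k}^*$ with predicate
$\vec{v} := (v_1, \ldots, v_n)$ is $\boldsymbol{k}^* := \sigma (v_1 \boldsymbol{b}^*_1 + \cdots + v_n \boldsymbol{b}^*_n) + \eta \boldsymbol{b}^*_{n+1} + (1 - \eta) \boldsymbol{b}^*_{n+2}$, where
$\sigma, \eta \xleftarrow{\mathsf{U}} \mathbb{F}_q$. If $\vec{x} \cdot \vec{v} = 0$, plaintext $m$ can be computed by $m = c_2 / e(\boldsymbol{c}_1, \boldsymbol{k}^*)$, since
$e(\boldsymbol{c}_1, \boldsymbol{k}^*) = \left( \prod_{i=1}^{n} e(\delta_1 x_i \boldsymbol{b}_i, \sigma v_i \boldsymbol{b}^*_i) \right) \cdot e(\zeta \boldsymbol{b}_{n+1}, \eta \boldsymbol{b}^*_{n+1}) \cdot e(\zeta \boldsymbol{b}_{n+2}, (1 - \eta) \boldsymbol{b}^*_{n+2}) =
g_T^{\delta_1 \sigma (\vec{x} \cdot \vec{v}) + \zeta} = g_T^{\zeta}$.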
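The 1-level PE construction can be followed step by step in a toy model. The sketch below rests on assumptions that are not in the paper: ciphertext and key vectors are written as coefficient lists over $(\boldsymbol{b}_1, \ldots, \boldsymbol{b}_{n+3})$ and $(\boldsymbol{b}^*_1, \ldots, \boldsymbol{b}^*_{n+3})$, the pairing exponent is their inner product (dual orthonormality), the plaintext is represented by its discrete logarithm to base $g_T$, and all randomness is passed in explicitly. The names `enc`, `gen_key`, `dec` and the prime `Q` are hypothetical, and none of this is secure.

```python
Q = 2**31 - 1  # toy prime standing in for q (assumption)

def inner(a, b):
    """Pairing exponent of coefficient vectors over dual orthonormal bases."""
    return sum(x * y for x, y in zip(a, b)) % Q

def enc(x, m_log, delta1, delta2, zeta):
    """c1 := delta1*(sum x_i b_i) + zeta*(b_{n+1} + b_{n+2}) + delta2*b_{n+3},
    c2 := g_T^zeta * m, stored as the exponent zeta + log(m)."""
    c1 = [delta1 * xi % Q for xi in x] + [zeta % Q, zeta % Q, delta2 % Q]
    c2 = (zeta + m_log) % Q
    return c1, c2

def gen_key(v, sigma, eta):
    """k* := sigma*(sum v_i b*_i) + eta*b*_{n+1} + (1 - eta)*b*_{n+2}."""
    return [sigma * vi % Q for vi in v] + [eta % Q, (1 - eta) % Q, 0]

def dec(c1, c2, k):
    """m = c2 / e(c1, k*), i.e. a subtraction of exponents."""
    return (c2 - inner(c1, k)) % Q
```

With `x = [1, 2, 3]` and the orthogonal predicate `v = [3, 0, Q - 1]`, decryption returns the plaintext exponent. For a non-orthogonal predicate the result is shifted by $\delta_1 \sigma (\vec{x} \cdot \vec{v})$.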

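We now explain the key idea of the proposed HPE scheme by using a small
(toy) example. Let the dimension of (predicate/attribute) vectors be 6, in which
there are three levels and each level has 2 dimensions, $\mathbb{V}$ and $\mathbb{V}^*$ be 9-dimensional
spaces, the public parameter be $\widehat{\mathbb{B}} := (\boldsymbol{b}_1, \ldots, \boldsymbol{b}_6, \boldsymbol{d}_7, \boldsymbol{b}_9)$ as well as the parameters
of DPVS, and the master secret key be ($X$ and) $\mathbb{B}^* := (\boldsymbol{b}^*_1, \ldots, \boldsymbol{b}^*_9)$, where $\boldsymbol{d}_7 :=
\boldsymbol{b}_7 + \boldsymbol{b}_8$.

Ciphertext $(\boldsymbol{c}_1, c_2)$ for attribute $\vec{x} := (\vec{x}_1, \vec{x}_2, \vec{x}_3) := ((x_1, x_2), (x_3, x_4),
(x_5, x_6)) \in \mathbb{F}_q^6$ and plaintext $m$ is constructed as $\boldsymbol{c}_1 := \delta_1 (x_1 \boldsymbol{b}_1 + x_2 \boldsymbol{b}_2) + \cdots +
\delta_3 (x_5 \boldsymbol{b}_5 + x_6 \boldsymbol{b}_6) + \zeta \boldsymbol{d}_7 + \delta_4 \boldsymbol{b}_9$ and $c_2 := g_T^{\zeta} m$, where $\delta_1, \ldots, \delta_4, \zeta \xleftarrow{\mathsf{U}} \mathbb{F}_q$. If the
attribute is a higher level such as $\vec{x}_1 := (x_1, x_2)$, generate a modified attribute
$\vec{x}^+ := ((x_1, x_2), (x^+_3, x^+_4), (x^+_5, x^+_6))$, where $(x^+_3, x^+_4, x^+_5, x^+_6) \xleftarrow{\mathsf{U}} \mathbb{F}_q^4$. Then,
ciphertext $\boldsymbol{c}_1$ for attribute $\vec{x}_1$ is computed as ciphertext $\boldsymbol{c}_1$ for the modified
attribute $\vec{x}^+$.

Top level secret key $\boldsymbol{k}^*_1 := (\boldsymbol{k}^*_{1,0}, \ldots, \boldsymbol{k}^*_{1,6})$ for predicate $\vec{v}_1 := (v_1, v_2) \in \mathbb{F}_q^2$
consists of three parts, $\boldsymbol{k}^*_{1,0}$, $(\boldsymbol{k}^*_{1,1}, \boldsymbol{k}^*_{1,2})$ and $(\boldsymbol{k}^*_{1,3}, \ldots, \boldsymbol{k}^*_{1,6})$, where the first one
is used for decryption of ciphertexts, the second one for re-randomization (of
delegated key), and the last one for delegation. Each part is:
$\boldsymbol{k}^*_{1,0} := \sigma_{1,0} (v_1 \boldsymbol{b}^*_1 + v_2 \boldsymbol{b}^*_2) + \eta_0 \boldsymbol{b}^*_7 + (1 - \eta_0) \boldsymbol{b}^*_8$,
$\boldsymbol{k}^*_{1,j} := \sigma_{1,j} (v_1 \boldsymbol{b}^*_1 + v_2 \boldsymbol{b}^*_2) + \eta_j \boldsymbol{b}^*_7 - \eta_j \boldsymbol{b}^*_8$ ($j = 1, 2$), and
$\boldsymbol{k}^*_{1,j} := \sigma_{1,j} (v_1 \boldsymbol{b}^*_1 + v_2 \boldsymbol{b}^*_2) + \psi \boldsymbol{b}^*_j + \eta_j \boldsymbol{b}^*_7 - \eta_j \boldsymbol{b}^*_8$ ($j = 3, \ldots, 6$),
where $\sigma_{1,j}, \eta_j, \psi \xleftarrow{\mathsf{U}} \mathbb{F}_q$ for $j = 0, \ldots, 6$. The first one, $\boldsymbol{k}^*_{1,0}$, can decrypt
ciphertext $(\boldsymbol{c}_1, c_2)$ by $c_2 / e(\boldsymbol{c}_1, \boldsymbol{k}^*_{1,0})$,
since $e(\boldsymbol{c}_1, \boldsymbol{k}^*_{1,0}) = g_T^{\zeta}$ if an attribute of $\boldsymbol{c}_1$ is $((x_1, x_2), (*, *), (*, *))$ with $(x_1, x_2) \cdot
(v_1, v_2) = 0$.

To delegate a secret key for the 2nd level vector $(v_3, v_4)$, $\sigma_{2,j} (v_3 \boldsymbol{k}^*_{1,3} +
v_4 \boldsymbol{k}^*_{1,4})$ is added to $\boldsymbol{k}^*_{1,0}$ ($j = 0$), $\boldsymbol{0}$ ($j = 1, 2, 3$), and $\psi^+ \boldsymbol{k}^*_{1,j}$ ($j = 5, 6$). To
re-randomize the coefficients of $(v_1 \boldsymbol{b}^*_1 + v_2 \boldsymbol{b}^*_2)$, $\boldsymbol{b}^*_7$ and $\boldsymbol{b}^*_8$ in the delegated key,
$(\alpha_{j,1} \boldsymbol{k}^*_{1,1} + \alpha_{j,2} \boldsymbol{k}^*_{1,2})$ is also added. So, the delegated key (the second level key)
$\boldsymbol{k}^*_2 := (\boldsymbol{k}^*_{2,0}, \ldots, \boldsymbol{k}^*_{2,3}, \boldsymbol{k}^*_{2,5}, \boldsymbol{k}^*_{2,6})$ (where $\boldsymbol{k}^*_{2,0}$ is for decryption, $(\boldsymbol{k}^*_{2,1}, \ldots, \boldsymbol{k}^*_{2,3})$
for re-randomization, and $(\boldsymbol{k}^*_{2,5}, \boldsymbol{k}^*_{2,6})$ for delegation) is computed as
$\boldsymbol{k}^*_{2,0} := \boldsymbol{k}^*_{1,0} + (\alpha_{0,1} \boldsymbol{k}^*_{1,1} + \alpha_{0,2} \boldsymbol{k}^*_{1,2}) + \sigma_{2,0} (v_3 \boldsymbol{k}^*_{1,3} + v_4 \boldsymbol{k}^*_{1,4})$,
$\boldsymbol{k}^*_{2,j} := (\alpha_{j,1} \boldsymbol{k}^*_{1,1} + \alpha_{j,2} \boldsymbol{k}^*_{1,2}) + \sigma_{2,j} (v_3 \boldsymbol{k}^*_{1,3} + v_4 \boldsymbol{k}^*_{1,4})$ ($j = 1, 2, 3$), and
$\boldsymbol{k}^*_{2,j} := \psi^+ \boldsymbol{k}^*_{1,j} + (\alpha_{j,1} \boldsymbol{k}^*_{1,1} + \alpha_{j,2} \boldsymbol{k}^*_{1,2}) + \sigma_{2,j} (v_3 \boldsymbol{k}^*_{1,3} + v_4 \boldsymbol{k}^*_{1,4})$ ($j = 5, 6$),
where $\alpha_{j,1}, \alpha_{j,2}, \sigma_{2,j}, \psi^+ \xleftarrow{\mathsf{U}} \mathbb{F}_q$ ($j = 0, 1, 2, 3, 5, 6$).
Then, the distribution of the delegated key (by Delegate) is equivalent to that
obtained by the key generation query (GenKey) except with negligible probability
(i.e., the simulation of ‘create delegated key query’ can be equivalent to that of
‘create key query’).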
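The toy example can be exercised in the same coefficient model used for illustration elsewhere. The following is a sketch under explicit assumptions rather than the authors' implementation: keys and ciphertexts are coefficient lists of length 9 over $\mathbb{B}^*$ and $\mathbb{B}$, the pairing exponent is their inner product, the plaintext is an exponent of $g_T$, and the key formulas are our reading of the toy example, namely $\boldsymbol{k}^*_{1,j} = \sigma_{1,j}(v_1 \boldsymbol{b}^*_1 + v_2 \boldsymbol{b}^*_2) + \eta_j \boldsymbol{b}^*_7 - \eta_j \boldsymbol{b}^*_8$, plus $\boldsymbol{b}^*_8$ for $j = 0$ and plus $\psi \boldsymbol{b}^*_j$ for $j = 3, \ldots, 6$. All function names and the prime `Q` are hypothetical.

```python
import random

Q = 2**31 - 1   # toy prime standing in for q (assumption; not secure)
DIM = 9         # position j-1 holds the coefficient of b_j (or b*_j)

def inner(a, b):
    return sum(x * y for x, y in zip(a, b)) % Q

def add(*vecs):
    return [sum(col) % Q for col in zip(*vecs)]

def scale(c, vec):
    return [c * x % Q for x in vec]

def unit(j):
    """Coefficient vector of the j-th basis element (1-based)."""
    e = [0] * DIM
    e[j - 1] = 1
    return e

def gen_key_level1(v1, v2, rng):
    """Top level key (k*_{1,0}, ..., k*_{1,6}) for predicate (v1, v2)."""
    psi = rng.randrange(Q)
    pred = add(scale(v1, unit(1)), scale(v2, unit(2)))
    key = {}
    for j in range(7):
        sigma, eta = rng.randrange(Q), rng.randrange(Q)
        k = add(scale(sigma, pred), scale(eta, unit(7)), scale(-eta, unit(8)))
        if j == 0:
            k = add(k, unit(8))              # eta*b*_7 + (1 - eta)*b*_8
        if j >= 3:
            k = add(k, scale(psi, unit(j)))  # delegation part: + psi*b*_j
        key[j] = k
    return key

def delegate_level2(k1, v3, v4, rng):
    """Second level key for (v3, v4), built only from the parent key k1."""
    psi_plus = rng.randrange(Q)
    key = {}
    for j in (0, 1, 2, 3, 5, 6):
        a1, a2, sigma = (rng.randrange(Q) for _ in range(3))
        k = add(scale(a1, k1[1]), scale(a2, k1[2]),
                scale(sigma, add(scale(v3, k1[3]), scale(v4, k1[4]))))
        if j == 0:
            k = add(k, k1[0])
        elif j >= 5:
            k = add(k, scale(psi_plus, k1[j]))
        key[j] = k
    return key

def enc(x, m_log, rng):
    """c1 := sum_t delta_t*(level-t block of x) + zeta*(b_7 + b_8) + delta_4*b_9."""
    zeta = rng.randrange(Q)
    c1 = [0] * DIM
    for t in range(3):
        d = rng.randrange(Q)
        c1[2 * t] = d * x[2 * t] % Q
        c1[2 * t + 1] = d * x[2 * t + 1] % Q
    c1[6] = c1[7] = zeta
    c1[8] = rng.randrange(Q)
    return c1, (zeta + m_log) % Q

def dec(c1, c2, k0):
    return (c2 - inner(c1, k0)) % Q
```

If the first two attribute blocks are orthogonal to $(v_1, v_2)$ and $(v_3, v_4)$ respectively, both the top level key and the delegated key recover the plaintext exponent, whatever randomness is used. In every key the $\boldsymbol{b}^*_7$ and $\boldsymbol{b}^*_8$ coefficients sum to 1 for the decryption component and to 0 for the others.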

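In general, as for the $\ell$-th level secret key, $\boldsymbol{k}^*_\ell := (\boldsymbol{k}^*_{\ell,0}, \ldots, \boldsymbol{k}^*_{\ell,\ell+1}, \boldsymbol{k}^*_{\ell,\mu_\ell+1}, \ldots,
\boldsymbol{k}^*_{\ell,n})$, the first one, $\boldsymbol{k}^*_{\ell,0}$, is used for decryption, the second part of components,
$\boldsymbol{k}^*_{\ell,1}, \ldots, \boldsymbol{k}^*_{\ell,\ell+1}$, are for re-randomization (of a delegated key), and the last part
of components, $\boldsymbol{k}^*_{\ell,\mu_\ell+1}, \ldots, \boldsymbol{k}^*_{\ell,n}$, are for delegation.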

5.2 HPE Scheme 

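Setup($1^\lambda$, $\vec{\mu} := (n, d; \mu_1, \ldots, \mu_d)$) : $(\mathsf{param}, \mathbb{B}, \mathbb{B}^*) \xleftarrow{\mathsf{R}} \mathcal{G}_{\mathsf{ob}}(1^\lambda, n + 3)$,
$\boldsymbol{d}_{n+1} := \boldsymbol{b}_{n+1} + \boldsymbol{b}_{n+2}$, $\widehat{\mathbb{B}} := (\boldsymbol{b}_1, \ldots, \boldsymbol{b}_n, \boldsymbol{d}_{n+1}, \boldsymbol{b}_{n+3})$,
return $\mathsf{sk} := (X, \mathbb{B}^*)$, $\mathsf{pk} := (1^\lambda, \mathsf{param}, \widehat{\mathbb{B}})$.

GenKey(pk, sk, $(\vec{v}_1, \ldots, \vec{v}_\ell) := ((v_1, \ldots, v_{\mu_1}), \ldots, (v_{\mu_{\ell-1}+1}, \ldots, v_{\mu_\ell}))$) :

$\sigma_{j,i}, \psi, \eta_j \xleftarrow{\mathsf{U}} \mathbb{F}_q$ for $j = 0, \ldots, \ell + 1, \mu_\ell + 1, \ldots, n$; $i = 1, \ldots, \ell$,

$\boldsymbol{k}^*_{\ell,0} := \sum_{t=1}^{\ell} \sigma_{0,t} \left( \sum_{i=\mu_{t-1}+1}^{\mu_t} v_i \boldsymbol{b}^*_i \right) + \eta_0 \boldsymbol{b}^*_{n+1} + (1 - \eta_0) \boldsymbol{b}^*_{n+2}$,

$\boldsymbol{k}^*_{\ell,j} := \sum_{t=1}^{\ell} \sigma_{j,t} \left( \sum_{i=\mu_{t-1}+1}^{\mu_t} v_i \boldsymbol{b}^*_i \right) + \eta_j \boldsymbol{b}^*_{n+1} - \eta_j \boldsymbol{b}^*_{n+2}$
for $j = 1, \ldots, \ell + 1$,

$\boldsymbol{k}^*_{\ell,j} := \sum_{t=1}^{\ell} \sigma_{j,t} \left( \sum_{i=\mu_{t-1}+1}^{\mu_t} v_i \boldsymbol{b}^*_i \right) + \psi \boldsymbol{b}^*_j + \eta_j \boldsymbol{b}^*_{n+1} - \eta_j \boldsymbol{b}^*_{n+2}$
for $j = \mu_\ell + 1, \ldots, n$,

return $\boldsymbol{k}^*_\ell := (\boldsymbol{k}^*_{\ell,0}, \ldots, \boldsymbol{k}^*_{\ell,\ell+1}, \boldsymbol{k}^*_{\ell,\mu_\ell+1}, \ldots, \boldsymbol{k}^*_{\ell,n})$.

Enc(pk, $m \in \mathbb{G}_T$, $(\vec{x}_1, \ldots, \vec{x}_h) := ((x_1, \ldots, x_{\mu_1}), \ldots, (x_{\mu_{h-1}+1}, \ldots, x_{\mu_h}))$) :

$(\vec{x}_{h+1}, \ldots, \vec{x}_d) \xleftarrow{\mathsf{U}} \mathbb{F}_q^{\mu_{h+1} - \mu_h} \times \cdots \times \mathbb{F}_q^{n - \mu_{d-1}}$, $\delta_1, \ldots, \delta_d, \delta_{n+3}, \zeta \xleftarrow{\mathsf{U}} \mathbb{F}_q$,

$\boldsymbol{c}_1 := \sum_{t=1}^{d} \delta_t \left( \sum_{i=\mu_{t-1}+1}^{\mu_t} x_i \boldsymbol{b}_i \right) + \zeta \boldsymbol{d}_{n+1} + \delta_{n+3} \boldsymbol{b}_{n+3}$, $c_2 := g_T^{\zeta} m$,
return $(\boldsymbol{c}_1, c_2)$.

Dec(pk, $\boldsymbol{k}^*_{\ell,0}$, $\boldsymbol{c}_1$, $c_2$) : $m' := c_2 / e(\boldsymbol{c}_1, \boldsymbol{k}^*_{\ell,0})$,

return $m'$.

Delegate$_\ell$(pk, $\boldsymbol{k}^*_\ell$, $\vec{v}_{\ell+1} := (v_{\mu_\ell+1}, \ldots, v_{\mu_{\ell+1}})$) :

$\alpha_{j,i}, \sigma_j, \psi^+ \xleftarrow{\mathsf{U}} \mathbb{F}_q$ for $j = 0, \ldots, \ell + 2, \mu_{\ell+1} + 1, \ldots, n$; $i = 1, \ldots, \ell + 1$,
$\boldsymbol{k}^*_{\ell+1,0} := \boldsymbol{k}^*_{\ell,0} + \sum_{i=1}^{\ell+1} \alpha_{0,i} \boldsymbol{k}^*_{\ell,i} + \sigma_0 \left( \sum_{i=\mu_\ell+1}^{\mu_{\ell+1}} v_i \boldsymbol{k}^*_{\ell,i} \right)$,
$\boldsymbol{k}^*_{\ell+1,j} := \sum_{i=1}^{\ell+1} \alpha_{j,i} \boldsymbol{k}^*_{\ell,i} + \sigma_j \left( \sum_{i=\mu_\ell+1}^{\mu_{\ell+1}} v_i \boldsymbol{k}^*_{\ell,i} \right)$ for $j = 1, \ldots, \ell + 2$,
$\boldsymbol{k}^*_{\ell+1,j} := \sum_{i=1}^{\ell+1} \alpha_{j,i} \boldsymbol{k}^*_{\ell,i} + \sigma_j \left( \sum_{i=\mu_\ell+1}^{\mu_{\ell+1}} v_i \boldsymbol{k}^*_{\ell,i} \right) + \psi^+ \boldsymbol{k}^*_{\ell,j}$ for $j = \mu_{\ell+1} + 1, \ldots, n$,

return $\boldsymbol{k}^*_{\ell+1} := (\boldsymbol{k}^*_{\ell+1,0}, \ldots, \boldsymbol{k}^*_{\ell+1,\ell+2}, \boldsymbol{k}^*_{\ell+1,\mu_{\ell+1}+1}, \ldots, \boldsymbol{k}^*_{\ell+1,n})$.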

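[Correctness] Assume that ciphertext $(\boldsymbol{c}_1, c_2)$ is generated by Enc(pk, $m$, $(\vec{x}_1,
\ldots, \vec{x}_h)$) and secret key $\boldsymbol{k}^*_{\ell,0}$ is generated by GenKey(pk, sk, $(\vec{v}_1, \ldots, \vec{v}_\ell)$). Note
that $e(\boldsymbol{c}_1, \boldsymbol{k}^*_{\ell,0}) = g_T^{\sum_{1 \leq i \leq \ell} \sigma_{0,i} \delta_i \vec{x}_i \cdot \vec{v}_i + \zeta}$. If $\ell \leq h$ and $\vec{x}_i \cdot \vec{v}_i = 0$ for all $i$ s.t. $1 \leq
i \leq \ell$, then $e(\boldsymbol{c}_1, \boldsymbol{k}^*_{\ell,0}) = g_T^{\zeta}$. Otherwise, $e(\boldsymbol{c}_1, \boldsymbol{k}^*_{\ell,0})$ is uniformly distributed.
Hence, correctness holds for secret keys generated by GenKey, and it also holds
for keys generated by Delegate by Claim d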

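Remark: A generalized delegation (not limited to a hierarchical delegation)
system can be constructed on (1-level) PE described in the first part of Section
5.1, where the parameters are the same as above.

In the generalized delegatable PE scheme, secret key generation procedure
GenKey(pk, sk, $\vec{v}_1 := (v_{1,1}, \ldots, v_{1,n})$) outputs $\boldsymbol{k}^*_1 := (\boldsymbol{k}^*_{1,\mathsf{dec}}, \boldsymbol{k}^*_{1,\mathsf{ran},1}, \boldsymbol{k}^*_{1,\mathsf{ran},2},
\boldsymbol{k}^*_{1,\mathsf{del},1}, \ldots, \boldsymbol{k}^*_{1,\mathsf{del},n})$, where
$\boldsymbol{k}^*_{1,\mathsf{dec}} := \sigma_{\mathsf{dec}} (\sum_{i=1}^{n} v_{1,i} \boldsymbol{b}^*_i) + \eta_{\mathsf{dec}} \boldsymbol{b}^*_{n+1} + (1 - \eta_{\mathsf{dec}}) \boldsymbol{b}^*_{n+2}$,
$\boldsymbol{k}^*_{1,\mathsf{ran},j} := \sigma_{\mathsf{ran},j} (\sum_{i=1}^{n} v_{1,i} \boldsymbol{b}^*_i) + \eta_{\mathsf{ran},j} \boldsymbol{b}^*_{n+1} - \eta_{\mathsf{ran},j} \boldsymbol{b}^*_{n+2}$ ($j = 1, 2$),
$\boldsymbol{k}^*_{1,\mathsf{del},j} := \sigma_{\mathsf{del},j} (\sum_{i=1}^{n} v_{1,i} \boldsymbol{b}^*_i) + \psi \boldsymbol{b}^*_j + \eta_{\mathsf{del},j} \boldsymbol{b}^*_{n+1} - \eta_{\mathsf{del},j} \boldsymbol{b}^*_{n+2}$ ($j = 1, \ldots, n$).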

To delegate secret key fe * for ~v 2 := (i>2,i, . . . , v 2>n ), where ~v 2 $. span(T’i), 
Delegate! (pk, k *, ~V 2 ) outputs ~ (^2 dec > ^2, ran 1) k 2, ran, 2) ^2 del 1> 
fc 2,del,J- Here > ^2, dec : = fe l,dec + S’Ll “dec.tfcl.nm.f + ^.decd^j, %>»&*, del,i) 5 

fe 2,ran,j : = Ei=l a ™^ k tr a n,z + CT 2,ran W 2,i*a ide |,i) C? = K delJ ~ 

Ei=l a deU fe*,ran,i + ^deyCEi*-! f2,ifei,del,i) + ^ / ^*,del,i C? = 1 > n )• Further 

delegation for k \ (£ = 2,3,...) can be done in the same manner. 

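Ciphertext $(\boldsymbol{c}_1, c_2)$ for attribute $\vec{x} := (x_1, \ldots, x_n)$ and plaintext $m \in \mathbb{G}_T$ is
the same as that of the 1-level PE. Key $\boldsymbol{k}^*_1$ can decrypt $(\boldsymbol{c}_1, c_2)$ if $\vec{v}_1 \cdot \vec{x} = 0$,
and key $\boldsymbol{k}^*_2$ can decrypt $(\boldsymbol{c}_1, c_2)$ if $(\vec{v}_1 \cdot \vec{x} = 0) \wedge (\vec{v}_2 \cdot \vec{x} = 0)$. Namely, the
capability of delegated key $\boldsymbol{k}^*_2$ is more limited than that of its parent key $\boldsymbol{k}^*_1$.
In general, the $\ell$-th delegated secret key $\boldsymbol{k}^*_\ell$ can decrypt $(\boldsymbol{c}_1, c_2)$ if $(\vec{v}_1 \cdot \vec{x} =
0) \wedge \cdots \wedge (\vec{v}_\ell \cdot \vec{x} = 0)$, where $\vec{v}_j \notin \mathrm{span}\langle \vec{v}_1, \ldots, \vec{v}_{j-1} \rangle$ for $2 \leq j \leq \ell$.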

5.3 Security 

Theorem 2. The proposed HPE scheme is selectively attribute-hiding against
chosen plaintext attacks under the RDSP and IDSP assumptions. For any
adversary $\mathcal{A}$, there exist probabilistic machines $\mathcal{B}_1$ and $\mathcal{B}_2$, whose running times
are essentially the same as that of $\mathcal{A}$, such that for any security parameter $\lambda$,

$\mathsf{Adv}^{\mathsf{HPE,AH}}_{\mathcal{A}}(\lambda) \leq \mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}_1}(\lambda) + \mathsf{Adv}^{\mathsf{IDSP}}_{\mathcal{B}_2}(\lambda) + 3\nu/q$,

where $\nu$ is the number of the adversary’s queries.

Proof Outline: To prove the security, we employ five games, Game 0 (origi- 
nal selective-security game) to Game 4 whose advantage is 0, where, roughly, 
Game 1 is conceptually changed (the timing of challenger’s coin flips is changed) 
from Game 0, a delegated key query (i.e., a reveal query of an already-created 
delegated key) is replied by using Gen Key (in place of Delegate) in Game 2, 
the plaintext part of the target ciphertext is randomized in Game 3, and the 
attribute vector part of the target ciphertext is randomized in Game 4. 

Since the distribution regarding each revealed key query in Game 2 is equiv- 
alent to that in Game 1 except with probability at most 3 /q, the gap between 
Games 1 and 2 is bounded by 3 v/q. 

To prove that the gap between Games 2 and 3 is bounded by the advantage of
the RDSP assumption, target ciphertext $(c_1, c_2)$ for $m^{(b)}$ is generated by using
$e_\beta$ from the RDSP assumption such that $c_1 := e_\beta + \zeta d_{n+1}$ and $c_2 := g_T^{\zeta} m^{(b)}$.
Then $(c_1, c_2)$ is a ciphertext in Game 2 when $\beta = 0$, and it is a ciphertext in
Game 3 when $\beta = 1$. The key generation oracle simulation can be perfectly
executed by using $\{h^{*(k)}_i\}_{i=1,\ldots,n;\,k=1,2,3}$ from the RDSP assumption (see the
Remark after the preceding theorem). The gap between Games 3 and 4 can be
evaluated similarly (through the IDSP assumption).

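Proof of Theorem 2

To prove Theorem 2, we consider the following five games.

Game 0: Original game (the selective attribute-hiding security game).

Game 1: Game 1 is the same as Game 0 except the following procedures:

1. When challenger $\mathcal{C}$ gets challenge attributes $(\vec{x}^{(0)}_1, \ldots, \vec{x}^{(0)}_{h^{(0)}})$ and
$(\vec{x}^{(1)}_1, \ldots, \vec{x}^{(1)}_{h^{(1)}})$ in the first step of the game, $\mathcal{C}$ selects (challenge) bit
$b \xleftarrow{U} \{0, 1\}$, and computes

$\vec{x}^+ := (x^+_1, \ldots, x^+_n) := (\delta_1 \vec{x}_1, \ldots, \delta_d \vec{x}_d),$

where $h := h^{(b)}$, $(\vec{x}_1, \ldots, \vec{x}_h) := (\vec{x}^{(b)}_1, \ldots, \vec{x}^{(b)}_h)$,
$(\vec{x}_{h+1}, \ldots, \vec{x}_d) \xleftarrow{U} \mathbb{F}_q^{\mu_{h+1}-\mu_h} \times \cdots \times \mathbb{F}_q^{n-\mu_{d-1}}$,
and $\delta_1, \ldots, \delta_d \xleftarrow{U} \mathbb{F}_q$.

2. When $\mathcal{C}$ gets challenge plaintexts $(m^{(0)}, m^{(1)})$ from adversary $\mathcal{A}$,
challenger $\mathcal{C}$ computes $(c_1, c_2)$ as below and returns it to $\mathcal{A}$.

$c_1 := \sum_{i=1}^{n} x^+_i b_i + \zeta d_{n+1} + \delta_{n+3} b_{n+3}, \quad c_2 := g_T^{\zeta} m^{(b)},$
where $\delta_{n+3}, \zeta \xleftarrow{U} \mathbb{F}_q$.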

Game 2: Game 2 is the same as Game 1 except the following procedures. 

1. When a create key query is issued by A, challenger C only records the 
specified predicates, and when a create delegated key query is issued, C 
only records the specified keys and predicates. In this step, C just records, 
but creates no corresponding keys. 

2. When a reveal key query is issued for a hierarchical (level-$\ell$) predicate
$(\vec{v}_1, \ldots, \vec{v}_\ell)$ which has been already recorded, $\mathcal{C}$ creates the queried
key by using GenKey. In addition, there is a special rule such that
$(\sigma_{0,1}, \ldots, \sigma_{0,\ell}) \xleftarrow{U} \mathbb{F}_q^{\ell}$ is selected again if
$\sum_{t=1}^{\ell} \sigma_{0,t}\, \delta_t\, \vec{x}_t \cdot \vec{v}_t = 0$ in the
computation process of GenKey.

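Game 3: Game 3 is the same as Game 2 except the target ciphertext $(c_1, c_2)$
is generated as follows:

$c_1 := \sum_{i=1}^{n} x^+_i b_i + \zeta_1 b_{n+1} + \zeta_2 b_{n+2} + \delta_{n+3} b_{n+3}, \quad c_2 := g_T^{\zeta} m^{(b)},$
where $\delta_{n+3}, \zeta, \zeta_1, \zeta_2 \xleftarrow{U} \mathbb{F}_q$.

Game 4: Game 4 is the same as Game 3 except the target ciphertext $(c_1, c_2)$
is generated as follows:

$c_1 := \sum_{i=1}^{n} u_i b_i + \zeta_1 b_{n+1} + \zeta_2 b_{n+2} + \delta_{n+3} b_{n+3}, \quad c_2 := g_T^{\zeta} m^{(b)},$
where $\delta_{n+3}, \zeta, \zeta_1, \zeta_2 \xleftarrow{U} \mathbb{F}_q$ and $\vec{u} := (u_1, \ldots, u_n) \xleftarrow{U} \mathbb{F}_q^n \setminus \{\vec{0}\}$.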

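Let $\mathsf{Adv}^{(0)}_{\mathcal{A}}(\lambda)$ be $\mathsf{Adv}^{\mathsf{HPE,AH}}_{\mathcal{A}}(\lambda)$ in Game 0, and $\mathsf{Adv}^{(i)}_{\mathcal{A}}(\lambda)$ ($i = 1, \ldots, 4$) be the
advantage of $\mathcal{A}$ in Game $i$. It is clear that $\mathsf{Adv}^{(0)}_{\mathcal{A}}(\lambda) = \mathsf{Adv}^{(1)}_{\mathcal{A}}(\lambda)$, since it is a
conceptual change. It is also clear that $\mathsf{Adv}^{(4)}_{\mathcal{A}}(\lambda) = 0$ by Lemma 4.

We will show three lemmas (Lemmas 1, 2 and 3) that evaluate the gaps between
pairs of $\mathsf{Adv}^{(i)}_{\mathcal{A}}(\lambda)$ ($i = 1, 2, 3, 4$). From these lemmas, we obtain
$\mathsf{Adv}^{\mathsf{HPE,AH}}_{\mathcal{A}}(\lambda) = \mathsf{Adv}^{(0)}_{\mathcal{A}}(\lambda) = \mathsf{Adv}^{(1)}_{\mathcal{A}}(\lambda)
\le \sum_{i=1}^{3} |\mathsf{Adv}^{(i)}_{\mathcal{A}}(\lambda) - \mathsf{Adv}^{(i+1)}_{\mathcal{A}}(\lambda)| + \mathsf{Adv}^{(4)}_{\mathcal{A}}(\lambda)
\le \mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}_1}(\lambda) + \mathsf{Adv}^{\mathsf{IDSP}}_{\mathcal{B}_2}(\lambda) + 3\nu/q$. □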




Lemma 1. For any adversary $\mathcal{A}$, $|\mathsf{Adv}^{(1)}_{\mathcal{A}}(\lambda) - \mathsf{Adv}^{(2)}_{\mathcal{A}}(\lambda)| \le 3\nu/q$.

Proof. The distribution of $\vec{k}^*_{\ell+1}$ generated by GenKey for a level-$(\ell+1)$ predicate
is equivalent to that by the combination of GenKey for the level-$\ell$ predicate and
Delegate$_\ell$, except with probability $2/q$, from Claim 1. Moreover, the special rule in
Game 2 causes a probability gap of at most $1/q$ for each GenKey operation. Therefore,
the revealed key distribution in Game 1 is equivalent to that in Game 2 except
with probability at most $1 - (1 - 3/q)^{\nu} \le 3\nu/q$, since the number of delegate
queries is upper-bounded by $\nu$. Hence (by using Shoup’s difference lemma), the
difference of $\mathsf{Adv}^{(1)}_{\mathcal{A}}(\lambda)$ and $\mathsf{Adv}^{(2)}_{\mathcal{A}}(\lambda)$ is upper-bounded by $3\nu/q$. □
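
The last step of this proof uses the elementary inequality $1 - (1 - 3/q)^{\nu} \le 3\nu/q$. As a sanity check rather than a proof, the sketch below evaluates both sides exactly with rational arithmetic for a few toy values; the function names and the sampled values of `q` and `nu` are illustrative assumptions.

```python
from fractions import Fraction


def exact_gap(q: int, nu: int) -> Fraction:
    """Probability that at least one of nu queries deviates: 1 - (1 - 3/q)^nu."""
    return 1 - (1 - Fraction(3, q)) ** nu


def linear_bound(q: int, nu: int) -> Fraction:
    """The union-bound style estimate 3*nu/q used in Lemma 1."""
    return Fraction(3 * nu, q)


# Compare the exact expression with the linear bound on a small grid.
checks = [
    (q, nu, exact_gap(q, nu) <= linear_bound(q, nu))
    for q in (7, 101, 65537)
    for nu in (1, 2, 10, 50)
]
```

For a cryptographically large $q$ the two quantities are nearly equal, so the linear bound loses essentially nothing.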

Claim 1. If $\vec{k}^*_\ell$ is generated by GenKey(pk, sk, $(\vec{v}_1, \ldots, \vec{v}_\ell)$), the distribution
of $\vec{k}^*_{\ell+1}$ generated by Delegate$_\ell$(pk, $\vec{k}^*_\ell$, $\vec{v}_{\ell+1}$) is equivalent to that of $\vec{k}^*_{\ell+1}$
generated by GenKey(pk, sk, $(\vec{v}_1, \ldots, \vec{v}_\ell, \vec{v}_{\ell+1})$) except with probability at most $2/q$.

Proof. The distribution of level-$\ell$ key $k^*_{\ell,j}$ ($j = 1, \ldots, \ell+1$) is represented by
that of the $\ell+1$ coefficients, $(\sigma_{j,1}, \ldots, \sigma_{j,\ell}, \eta_j)$, of
$\sum_{i=\mu_{t-1}+1}^{\mu_t} v_i b^*_i$ ($t = 1, \ldots, \ell$)
and $b^*_{n+1}$ (and the coefficient, $\psi$, of $b^*_j$ in addition when $j = \mu_\ell+1, \ldots, n$), since
the coefficient of $b^*_{n+2}$ is dependent on that of $b^*_{n+1}$.

Similarly, the distribution of level-$(\ell+1)$ key $k^*_{\ell+1,j}$ ($j = 1, \ldots, \ell+2$) is
represented by that of the $\ell+2$ coefficients, $(\sigma_{j,1}, \ldots, \sigma_{j,\ell+1}, \eta_j)$.

When level-$\ell$ key $k^*_{\ell,j}$ ($j = 1, \ldots, \ell+1$) is generated by
GenKey(pk, sk, $(\vec{v}_1, \ldots, \vec{v}_\ell)$), its coefficient matrix is uniformly distributed.

If coefficient matrix $(\sigma_{j,1}, \ldots, \sigma_{j,\ell}, \eta_j)_{j=1,\ldots,\ell+1}$ (an $(\ell+1) \times (\ell+1)$ matrix) of
$(k^*_{\ell,j})_{j=1,\ldots,\ell+1}$ is regular and $\psi \ne 0$, then the coefficients
of Delegate$_\ell$(pk, $\vec{k}^*_\ell$, $\vec{v}_{\ell+1}$) are uniformly distributed, i.e., Delegate$_\ell$(pk, $\vec{k}^*_\ell$, $\vec{v}_{\ell+1}$)
is equivalently distributed as GenKey(pk, sk, $(\vec{v}_1, \ldots, \vec{v}_{\ell+1})$).

Here, $(\sigma_{j,1}, \ldots, \sigma_{j,\ell}, \eta_j)_{j=1,\ldots,\ell+1}$ (an $(\ell+1) \times (\ell+1)$ matrix) of $(k^*_{\ell,j})_{j=1,\ldots,\ell+1}$
is regular and $\psi \ne 0$ except with probability at most $2/q$. □

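Lemma 2. For any adversary $\mathcal{A}$, there exists a probabilistic machine $\mathcal{B}_1$, whose
running time is essentially the same as that of $\mathcal{A}$, such that for any security
parameter $\lambda$, $|\mathsf{Adv}^{(2)}_{\mathcal{A}}(\lambda) - \mathsf{Adv}^{(3)}_{\mathcal{A}}(\lambda)| = \mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}_1}(\lambda)$.

Proof. In order to prove Lemma 2, we construct a probabilistic machine $\mathcal{B}_1$
against the RDSP problem by using any adversary $\mathcal{A}$ in a security game (Game 2
or 3) as a black box as follows:

1. $\mathcal{B}_1$ is given RDSP instance $(\mathsf{param}, \mathbb{B}, \vec{y}, e_\beta)$.

2. $\mathcal{B}_1$ plays the role of challenger $\mathcal{C}$ in the security game against adversary $\mathcal{A}$.

3. When $\mathcal{B}_1$ (or challenger $\mathcal{C}$) gets challenge attributes $(\vec{x}^{(0)}_1, \ldots, \vec{x}^{(0)}_{h^{(0)}})$ and
$(\vec{x}^{(1)}_1, \ldots, \vec{x}^{(1)}_{h^{(1)}})$ in the first step of the game, $\mathcal{B}_1$ selects (challenge) bit
$b \xleftarrow{U} \{0, 1\}$, and computes

$\vec{x}^+ := (\delta_1 \vec{x}_1, \ldots, \delta_d \vec{x}_d),$

where $h := h^{(b)}$, $(\vec{x}_1, \ldots, \vec{x}_h) := (\vec{x}^{(b)}_1, \ldots, \vec{x}^{(b)}_h)$,
$(\vec{x}_{h+1}, \ldots, \vec{x}_d) \xleftarrow{U} \mathbb{F}_q^{\mu_{h+1}-\mu_h} \times \cdots \times \mathbb{F}_q^{n-\mu_{d-1}}$,
and $\delta_1, \ldots, \delta_d \xleftarrow{U} \mathbb{F}_q$.

Let $\Pi := (\pi_{i,j}) \xleftarrow{U} \{\Pi \in GL(n, \mathbb{F}_q) \mid \vec{y} = \vec{x}^+ \cdot \Pi,\ \Pi^T = \Pi\}$, and
$\Pi^* := (\pi^*_{i,j}) := ((\pi_{i,j})^T)^{-1}$. Note that $\vec{x}^+ = \vec{y} \cdot \Pi^*$. Public parameter pk is then
calculated as follows and $\mathcal{B}_1$ returns pk to $\mathcal{A}$: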

bj := Ee=i b* := £^1 n*^ (j = 1 , . n), 

$\widehat{\mathbb{B}} := (b_1, \ldots, b_n, d_{n+1}, b_{n+3})$, $pk := (1^\lambda, \mathsf{param}, \widehat{\mathbb{B}})$.

4. When a reveal key query is issued for a hierarchical (level-$\ell$) predicate
$(\vec{v}_1, \ldots, \vec{v}_\ell)$ which has been already recorded, $\mathcal{B}_1$ answers as follows: for
$j = 0, \ldots, \ell+1, \mu_\ell+1, \ldots, n$, $\mathcal{B}_1$ calculates

: = (*$ti • • ***&«) := (!) 

where aj,i, . . . , <Tj.n ^ ¥ q . Then, B\ calculates and returns k \ := (fc| 0 , . . . , 
• • ■ , fe| iTl ) using in the RDSP instance: 

*0 := £ti «o, fc £f£ E:=x < s t™ , 
kl 0 := 0o 1 £fc=x «o,fc Eflx v cm Ee=i 1 

For j f= 1, . . . ,^+ 1,^ + 1, . . . ,.n; s = 1, 2, 

0,, s := ELi a Ik,s £ fix E;=i 

//,. : = Etx a J,k,s Eiix E"=x < >e 4 fe) *, 

:= 0j,2/ J *x - 0 j, if la* 

For j = fjg + 1, . . . ,n, 

For i = 1,...,/Z£,j, 

W := ELi Ofc E”=x ', m* := £tx Sfc £” =1 
%• : = % (Efix vtiViT 1 , fe £ : = k h + ESi > 

where oo.fe, Oj,fc )S , afc F fl for y = 1, , . . ,4+T, m + 1, . . . ,n; k = 1,2,3; s « 
1,2. 

If 00 = 0, {cr 0 , t ,a 0 ,fc ^}*=i,2,3;f=i,...,l is selected again. For j = m + 
1, ...,», if Efix vf&i = 0, {aj tt ,dk F g }fc = x i2 ,3;t=i,...^ is selected again. 

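5. When $\mathcal{B}_1$ (or $\mathcal{C}$) gets challenge plaintexts $(m^{(0)}, m^{(1)})$ (from $\mathcal{A}$), $\mathcal{B}_1$ calculates
and returns $(c_1, c_2)$ s.t. $c_1 := e_\beta + \zeta d_{n+1}$ and $c_2 := g_T^{\zeta} m^{(b)}$ using $e_\beta$ in the
RDSP instance, $\zeta$, and $m^{(b)}$, where $\zeta \xleftarrow{U} \mathbb{F}_q$.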

6. After the encryption query, GenKey oracle simulation for a reveal key query 
is executed as above. 

7. $\mathcal{A}$ outputs bit $b'$. If $b = b'$, $\mathcal{B}_1$ outputs $\beta' := 1$. Otherwise, $\mathcal{B}_1$ outputs $\beta' := 0$.

To prove Lemma 2, we show Claims 2, 3 and 4.

Claim 2. Public parameter pk generated in step 3 above has the same distribution
as that in Game 2 (and Game 3).

Proof. Let $D := \begin{pmatrix} \Pi & 0 \\ 0 & I_3 \end{pmatrix}$ be the square $(n+3) \times (n+3)$ matrix composed of
$\Pi$ and the identity matrix $I_3$. Then basis $(b_1, \ldots, b_n, b_{n+1}, b_{n+2}, b_{n+3})$ of $\mathbb{V}$ is
obtained from basis $\mathbb{B}$ by the linear transformation determined by $D$. Hence, its
distribution is uniform. Therefore, $\widehat{\mathbb{B}} = (b_1, \ldots, b_n, d_{n+1}, b_{n+3})$ in step 3 has the
same distribution as that in Game 2 (and Game 3). □

Claim 3. Secret key $\vec{k}^*_\ell$ generated in steps 4 and 6 above has the same distribution
as that in Game 2 (and Game 3).

Proof. First, we verify that basis $(b^*_1, \ldots, b^*_n, b^*_{n+1}, b^*_{n+2}, b^*_{n+3})$ of $\mathbb{V}^*$ is obtained
by the linear transformation $(D^T)^{-1}$, where $D$ is defined in the proof of Claim 2.
That is, it is dual orthonormal to basis $(b_1, \ldots, b_n, b_{n+1}, b_{n+2}, b_{n+3})$. Therefore,
we can consider $k^*_{\ell,j}$ w.r.t. this dual orthonormal basis.

Secret key fe| 0 generated in steps 4 and 6 is 9q 1 (Ek=i Ef=i v o,iK 

+O0 1 0 1 b* n+1 +6o 1 6 2 b* n+2 , where 0! := (ELi a o,fc7i fc) )"^o -"? + , #2 := (ELi a o ,k 
7 f)E+ ' an d 0o = 0i + 9 2 . Let a ■.= X (Efc=i a o,kU^). Then, cr, 61, 9 2 are 
independently uniform, since ao,fc are independently uniform, and 9q 1 9i+9q 1 9 2 = 
1. Also, from (© , the coefficients of EfEt.i+i v i^i * n fe| 0 for each 1 < t < £ are all 
uniformly and independently distributed. Therefore, generated fe| 0 has the same 
distribution as in Game 2 and Game 3. 

Similarly, for $j = 1, \ldots, \ell+1, \mu_\ell+1, \ldots, n$, the $j$-th key $k^*_{\ell,j}$ has independently
uniform coefficients w.r.t. $\sum_{i=\mu_{t-1}+1}^{\mu_t} v_i b^*_i$ for each $1 \le t \le \ell$, and the sum of the
coefficients of $b^*_{n+1}$ and $b^*_{n+2}$ is zero.

Finally, we investigate the distribution of the coefficients of b* in k%j for 
j = pi + 1 , . . . , n. The additional term m* — Zj Efii v t/i rn i ' s 

(eLi W fc >) Efii v+~b* + (eLi 5^ (fc) ) b* 

+ (to ,i - Zj Efil v£i<Pi,i) b n + 1 + faj - z i Ef= 1 b n+2> ( 2 ) 

where := (Efe=i“fc7i fc) ) ^2,i •= (Efc= and ^ 

Therefore, for j = pe + 1, . . . , n, the sum of the coefficients of 6* +1 and b* +2 in 
(J2J is zero, and the coefficients of b* in fe| ■ are common, Efc=i which is 

uniformly distributed. □ 

Claim 4. If $\beta = 0$, the distribution of $(c_1, c_2)$ generated in step 5 is the same
as that in Game 2. If $\beta = 1$, the distribution of $(c_1, c_2)$ generated in step 5 is
the same as that in Game 3.

Proof. If (3= 0, ci = (5i EHi Vibi + C^n+i + £2^+3 = <5i E"=i x t b i + C d n+1 + 
bib n+ 3 and c 2 := g^rn^ h ' 1 . This is the target ciphertext in Game 2 with pk := 
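$(1^\lambda, \mathsf{param}, \widehat{\mathbb{B}})$. If $\beta = 1$, $c_1 = \delta_1 \sum_{i=1}^{n} x^+_i b_i + (\zeta + \zeta_1) b_{n+1} + (\zeta + \zeta_2) b_{n+2} + \delta_2 b_{n+3}$
and $c_2 := g_T^{\zeta} m^{(b)}$. Because $\zeta + \zeta_1$, $\zeta + \zeta_2$, and $\zeta$ are independently uniform, this
is the target ciphertext in Game 3 with $pk := (1^\lambda, \mathsf{param}, \widehat{\mathbb{B}})$. □

From Claims 2, 3 and 4, when $\beta = 0$, the advantage of $\mathcal{A}$ in the above game is
equal to that in Game 2, i.e., $\mathsf{Adv}^{(2)}_{\mathcal{A}}(\lambda)$, and also is equal to
$\Pr_0 := \Pr[\mathcal{B}_1(1^\lambda, \varrho) \to 1 \mid \varrho \xleftarrow{R} \mathcal{G}^{\mathsf{RDSP}}_0(1^\lambda, n)]$. Similarly, when $\beta = 1$, we see that the
advantage of $\mathcal{A}$ in the above game is equal to $\mathsf{Adv}^{(3)}_{\mathcal{A}}(\lambda)$, and also is equal to
$\Pr_1 := \Pr[\mathcal{B}_1(1^\lambda, \varrho) \to 1 \mid \varrho \xleftarrow{R} \mathcal{G}^{\mathsf{RDSP}}_1(1^\lambda, n)]$. Therefore,
$|\mathsf{Adv}^{(2)}_{\mathcal{A}}(\lambda) - \mathsf{Adv}^{(3)}_{\mathcal{A}}(\lambda)| = |\Pr_0 - \Pr_1| = \mathsf{Adv}^{\mathsf{RDSP}}_{\mathcal{B}_1}(\lambda)$.
This completes the proof of Lemma 2. □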

Lemma 3. For any adversary $\mathcal{A}$, there exists a probabilistic machine $\mathcal{B}_2$, whose
running time is essentially the same as that of $\mathcal{A}$, such that for any security
parameter $\lambda$, $|\mathsf{Adv}^{(3)}_{\mathcal{A}}(\lambda) - \mathsf{Adv}^{(4)}_{\mathcal{A}}(\lambda)| = \mathsf{Adv}^{\mathsf{IDSP}}_{\mathcal{B}_2}(\lambda)$.

Proof. Lemma 3 is proved similarly to Lemma 2. The proof will be given in the
full version of this paper. □

Lemma 4. For any adversary $\mathcal{A}$, $\mathsf{Adv}^{(4)}_{\mathcal{A}}(\lambda) = 0$.

Proof. The value of $b$ is independent of the adversary’s view in Game 4. Hence,
$\mathsf{Adv}^{(4)}_{\mathcal{A}}(\lambda) = 0$. □

Abstract. Public-key encryption schemes rely for their IND-CPA security
on per-message fresh randomness. In practice, randomness may be
of poor quality for a variety of reasons, leading to failure of the schemes. 
Expecting the systems to improve is unrealistic. What we show in this 
paper is that we can, instead, improve the cryptography to offset the 
lack of possible randomness. We provide public-key encryption schemes 
that achieve IND-CPA security when the randomness they use is of high 
quality, but, when the latter is not the case, rather than breaking com- 
pletely, they achieve a weaker but still useful notion of security that we 
call IND-CDA. This hedged public-key encryption provides the best pos- 
sible security guarantees in the face of bad randomness. We provide sim- 
ple RO-based ways to make in-practice IND-CPA schemes hedge secure 
with minimal software changes. We also provide non-RO model schemes 
relying on lossy trapdoor functions (LTDFs) and techniques from deter- 
ministic encryption. They achieve adaptive security by establishing and 
exploiting the anonymity of LTDFs which we believe is of independent 
interest. 


1 Introduction 

Cryptography ubiquitously assumes that parties have access to sufficiently good 
randomness. In practice this assumption is often violated. This can happen be- 
cause of faulty implementations, side-channel attacks, system resets or for a 
variety of other reasons. The resulting cryptographic failures can be spectacular
[22,24,29,2,15]. What can we do about this? One answer is that system designers
should build “better” systems, but this is clearly easier said than done.
The reality is that random number generation is a complex and difficult task, 
and it is unrealistic to think that failures will never occur. We propose a different 
approach: designing schemes in such a way that poor randomness will have as 
little as possible impact on the security of the scheme in the following sense. 
With good randomness the scheme achieves whatever (strong) security notion
one is targeting, but when the same scheme is fed bad (even adversarially chosen)
randomness, rather than breaking completely, it achieves some weaker but
still useful notion of security that is the best possible under the circumstances. 
We call this “hedged” cryptography. 

Previous work by Rogaway, by Rogaway and Shrimpton, and by Kamara
and Katz considers various forms of hedging for the symmetric encryption
setting. In this paper, we initiate a study of hedged public-key encryption. We
address two central foundational questions, namely to find appropriate defini- 
tions and to efficiently achieve them. Let us now look at all this in more detail. 

The problem. Achieving the standard IND-CPA notion of privacy requires
the encryption algorithm to be randomized. In addition to the public key and 
message, it takes as input a random string that needs to be freshly and indepen- 
dently created for each and every encryption. 

Weak (meaning, low-entropy) randomness does not merely imply a loss of 
theoretical security. It can lead to catastrophic attacks. For example, weak- 
randomness based encryption is easily seen to allow recovery of the plaintext 
from the ciphertext for the quadratic residuosity scheme of Goldwasser and
Micali as well as the El Gamal encryption scheme. Brown presents such an attack
on RSA-OAEP with encryption exponent 3. Ouafi and Vaudenay present such
an attack on Rabin-SAEP |X3! - We present an alternative attack in jZj- 
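
To illustrate the kind of plaintext recovery described above for El Gamal, here is a minimal sketch under toy assumptions: a tiny prime modulus, a fixed secret key, and coins known to come from a set of only 16 values. The names `elgamal_encrypt` and `recover_plaintext` and all parameters are hypothetical and chosen for illustration; this is not one of the cited attacks.

```python
def elgamal_encrypt(p, g, pk, m, r):
    """Textbook El Gamal over Z_p^* with explicit coins r: (g^r, m * pk^r)."""
    return pow(g, r, p), (m * pow(pk, r, p)) % p


def recover_plaintext(p, g, pk, ciphertext, candidate_coins):
    """Try every candidate coin value; a match on g^r reveals the mask pk^r."""
    c1, c2 = ciphertext
    for r in candidate_coins:
        if pow(g, r, p) == c1:
            mask = pow(pk, r, p)
            return (c2 * pow(mask, -1, p)) % p   # modular inverse, Python 3.8+
    return None


p, g = 467, 2                      # toy group parameters (assumption)
sk = 127                           # secret key, never used by the attacker
pk = pow(g, sk, p)
weak_coins = range(1000, 1016)     # coins drawn from only 16 values
m = 123
ciphertext = elgamal_encrypt(p, g, pk, m, 1007)
recovered = recover_plaintext(p, g, pk, ciphertext, weak_coins)
```

The attacker needs only the public key and the small candidate set, so the work is proportional to the number of plausible coin values.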

The above would be of little concern if we could guarantee good randomness. 
Unfortunately, this fails to be true in practice. Here, an “entropy-gathering” 
process is used to get a seed which is then stretched to get “random” bits for 
the application. The theory of cryptographically strong pseudorandom number 
generators implies that the stretching can in principle be sound, and extractors
further allow us to reduce the requirement on the seed from being uniformly
distributed to having high min-entropy, but we still need a sufficiently good seed. 
(No amount of cryptography can create randomness out of nothing!) In prac- 
tice, entropy might be gathered from timing-related operating system events or 
user keystrokes. As evidence that this process is error-prone, consider the recent 
randomness failure in Debian Linux, where a bug in the OpenSSL package led 
to insufficient entropy gathering and thence to practical attacks on the SSH m 
and SSL [213 fij protocols. Other exploits include f 2 fill Dj . 

The new notion. The idea is to provide two tiers of security. First, when the 
“randomness” is really random, the scheme should meet the standard IND-CPA 
notion of security. Otherwise, rather than failing completely, it should gracefully 
achieve some weaker but as-good-as-possible notion of security. The first impor- 
tant question we then face is to pick and formally define this fallback notion. 

Towards this, we begin by suggesting that the message being encrypted may 
also have entropy or uncertainty from the point of view of the adversary. (If not, 
what privacy is there to be preserved by encryption?) We propose to harvest this. 
In this regard, the first requirement that might come to mind is that encryption 
with weak (even adversarially-known) randomness should be as secure as deterministic
encryption, meaning achieve an analog of the PRIV notion for deterministic encryption. But
achieving this would require that the message by itself have high min-entropy.
We can do better. Our new target notion of security, that we call Indistinguisha- 
bility under a Chosen Distribution Attack (IND-CDA), asks that security is 
guaranteed as long as the joint distribution of the message and randomness has 
sufficiently high min-entropy. In this way, we can exploit for security whatever 
entropy might be present in the randomness or the message, and in particular 
achieve security even if neither taken alone is random enough. 

Notice that if the message and randomness together have low min-entropy, 
then we cannot hope to achieve security, because an adversary can recover the 
message with high probability by trial encryption with all message-randomness 
pairs that occur with a noticeable probability. In a nutshell, our new notion 
asks that this necessary condition is also sufficient, and in this way is requiring 
security that is as good as possible. 
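
The trial-encryption argument can be sketched as follows. The example assumes a toy El Gamal instance and a joint distribution with only four equally likely message-randomness pairs (2 bits of joint min-entropy). The function names and parameters are hypothetical.

```python
from itertools import product


def encrypt(pk, m, r):
    """Toy El Gamal with explicit coins r; pk = (p, g, h)."""
    p, g, h = pk
    return pow(g, r, p), (m * pow(h, r, p)) % p


def trial_encryption_attack(pk, ciphertext, likely_pairs):
    """Re-encrypt every likely (message, coins) pair and compare with the target."""
    for m, r in likely_pairs:
        if encrypt(pk, m, r) == ciphertext:
            return m
    return None


p, g, sk = 467, 2, 127                              # toy parameters (assumption)
pk = (p, g, pow(g, sk, p))
likely_pairs = list(product((10, 20), (5, 6)))      # four (m, r) pairs
target = encrypt(pk, 20, 6)
guess = trial_encryption_attack(pk, target, likely_pairs)
```

The attack uses only the public key, which is why no public-key scheme can protect a message-randomness pair drawn from such a small set.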

We denote by H-IND our notion of hedged security that is satisfied by encryp- 
tion schemes that are secure both in the sense of IND-CPA and in the sense of 
IND-CDA. 

Adaptivity. Our IND-CDA definition generalizes the indistinguishability-style
formalizations of PRIV-secure deterministic encryption [8,12], which in turn
extended entropic security. But we consider a new dimension, namely, adaptivity.
Our adversary is allowed to specify joint message-randomness distributions
on to-be-encrypted challenges. The adversary is said to be adaptive if these 
queries depend on the replies to previous ones. Non-adaptive H-IND means IND- 
CPA plus non-adaptive IND-CDA and adaptive H-IND means IND-CPA plus 
adaptive IND-CDA. 

Non-adaptive IND-CDA is a notion of security for randomized schemes that 
becomes identical to PRIV in the special case that the scheme is deterministic. 
Adaptive IND-CDA, when restricted to deterministic schemes, is an adaptive 
strengthening of PRIV that we think is interesting in its own right. As a conse- 
quence of the results discussed below, we get the first deterministic encryption 
schemes that achieve this stronger notion. 

Schemes with random oracles. Our random oracle (RO) model schemes and
their attributes are summarized in the first two rows of the table of Figure 1.
Both REwH1 and REwH2 efficiently transform an arbitrary (randomized) IND-CPA
scheme into an H-IND scheme with the aid of the RO. They are simple ways
to make in-practice encryption schemes H-IND secure with minimal software
changes. REwH1 has the advantage of not changing the public key and thus not
requiring new certificates. It always provides non-adaptive H-IND security. It
provides adaptive H-IND security if the starting scheme has the extra property
of being anonymous (ANON-CPA in Figure 1). Anonymity is possessed by some
deployed schemes like DHIES, making REwH1 attractive in this case. But some
in-practice schemes, notably RSA ones, are not anonymous. If one wants adaptive
H-IND security in this case we suggest REwH2, which provides it assuming only
that the starting scheme is IND-CPA. It does this by adding a randomizer to

         Non-adaptive H-IND    Adaptive H-IND
REwH1    IND-CPA               IND-CPA + ANON-CPA
REwH2    IND-CPA               IND-CPA
RtD      IND-CPA, PRIV         IND-CPA, (u-)LTDF
PtD      (u-)LTDF              (u-)LTDF

Fig. 1. Table entries for the first two rows indicate the assumptions made on the
(randomized) encryption scheme that underlies the RO-model hedged schemes in question.
The entries for standard model scheme RtD are the assumptions on the underlying
randomized and deterministic encryption schemes, respectively, and for PtD, on the
underlying deterministic encryption scheme, which is the only primitive it uses.


the public key, so it does require new certificates. The schemes are extensions of 
the EwH deterministic encryption scheme of jO] and similar to 123 - 

Schemes without random oracles. It is easy to see that even the existence 
of a non-adaptively secure IND-CDA encryption scheme implies the existence of 
a PRIV-secure deterministic encryption (DE) scheme. Achieving PRIV without
ROs is already hard. Indeed, fully PRIV-secure DE without ROs has not yet
been built. Prior work, however, does show how to construct PRIV-secure DE
without ROs for block sources [12]. (Messages being encrypted have high
min-entropy even conditioned on previous messages.) But H-IND introduces three
additional challenges: (1) the min-entropy guarantee is on the joint message- 
randomness distribution rather than merely on the message; (2) we want a single 
scheme that is not only IND-CDA secure but also IND-CPA-secure; and (3) the 
adversary’s queries may be adaptive. 

We are able to overcome these challenges to the best extent possible. We pro- 
vide schemes that are H-IND-secure in the same setting as the best known PRIV 
ones, namely, for block sources, where we suitably extend the latter notion to 
consider both randomness and messages. Furthermore, we achieve these results 
under the same assumptions as previous work. 

Our standard model schemes and their attributes are summarized in the last
two rows of the table of Figure 1. RtD is formed by the generic composition
of a deterministic scheme and a randomized scheme and achieves non-adaptive
H-IND security as long as the base schemes meet their regular conditions. (That
is, the former is PRIV-secure for block sources and the latter is IND-CPA.)
Adaptive security requires that the deterministic scheme be a u-LTDF. (A lossy
trapdoor function whose lossy branch is a universal hash function.) PtD is
simpler, merely concatenating the message to the randomness and then applying
deterministic encryption. It achieves both non-adaptive and adaptive H-IND
under the assumption that the deterministic scheme is a u-LTDF. For both
schemes, the universality assumption on the LTDF can be dropped by modifying
the scheme and using the crooked leftover hash lemma as per [12]. (This is why
the “u” is parenthesized in the table of Figure 1.)

Anonymous LTDFs. Also of independent interest, we show that any u-LTDF
is anonymous. Here we refer to a new notion of anonymity for trapdoor functions
that we introduce, one that strengthens an earlier notion. This step exploits an
adaptive variant of the leftover hash lemma.

Why anonymity? It is exploited in our proofs of adaptive security. Our new 
notion of anonymity for trapdoor functions is matched by a corresponding one 
for encryption schemes. We show that any encryption scheme that is both 
anonymous and non-adaptive H-IND secure is also adaptively H-IND secure. 
Anonymity of the u-LTDF, in our encryption schemes based on the latter prim- 
itive, allows us to show that these schemes are anonymous and thereby lift their 
non-adaptive security to adaptive. 

Related work. In the symmetric setting, several works have recognized and 
addressed the problem of security in the face of bad randomness. Concern over 
the quality of available randomness is one of Rogaway’s motivations for introducing
nonce-based symmetric encryption, where security relies on the nonce
never repeating rather than being random. Rogaway and Shrimpton provide 
a symmetric authenticated encryption scheme that defaults to a PRF when the 
randomness is known. 

Kamara and Katz provide symmetric encryption schemes secure against 
chosen-randomness attack (CRA). Here the adversary can obtain encryption un- 
der randomness of its choice but privacy is only required for messages encrypted 
with perfect, hidden randomness. Entropy in the messages is not considered or 
used. We in contrast seek privacy even when the randomness is bad as long as 
there is compensating entropy in the message. Also we deal with the public key 
setting. 

Many works consider achieving strong cryptography given only a “weak random
source” [28,16,14]. This is a source that does have high min-entropy but may
not produce truly random bits. They show that many cryptographic tasks,
including symmetric encryption, commitment, secret-sharing, and zero knowledge,
are impossible in this setting. We are not in this setting. We do assume
a small amount of initial good randomness to produce keys. (This makes sense
because it is one-time and because otherwise we can’t hope to achieve anything
anyway.) On the other hand our assumption on the randomness available for
encryption is even weaker than in the works mentioned. (We do not even assume
it has high min-entropy.) Our key idea is to exploit the entropy in the message,
which is not done in [28,16,14]. This allows us to circumvent their negative
results.

Waters independently proposed hedge security as well as the PtD construction
as a way to achieve it.

2 Preliminaries 

Notation. Vectors are written in boldface, e.g. $\mathbf{x}$. If $\mathbf{x}$ is a vector then $|\mathbf{x}|$ denotes
its length and $\mathbf{x}[i]$ denotes its $i$th component for $1 \le i \le |\mathbf{x}|$. We say that $\mathbf{x}$ is
a vector over $D$ if $\mathbf{x}[i] \in D$ for all $1 \le i \le |\mathbf{x}|$. Throughout, $k \in \mathbb{N}$ denotes the
security parameter and $1^k$ its unary encoding. Unless otherwise indicated, an
algorithm is randomized. The set of possible outputs of algorithm $A$ on inputs
$x_1, x_2, \ldots$ is denoted $[A(x_1, x_2, \ldots)]$. “PT” stands for polynomial-time.

Games. Our security definitions and proofs use code-based games, and so we
recall some background on them. A game (see Figure 2 for an example) has an
Initialize procedure, procedures to respond to adversary oracle queries, and a
Finalize procedure. A game $G$ is executed with an adversary $A$ as follows. First,
Initialize executes, and its outputs are the inputs to $A$. Then $A$ executes, its
oracle queries being answered by the corresponding procedures of $G$. When $A$
terminates, its output becomes the input to the Finalize procedure. The output
of the latter is called the output of the game, and we let $G^A \Rightarrow y$ denote the
event that this game output takes value $y$. Our convention is that the running
time of an adversary is the time to execute the adversary with the game that
defines security, so that the running time of all game procedures is included.
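
The execution model for code-based games can be sketched in a few lines. The class `BitGuessGame`, its `leak` oracle, and the runner `run_game` are hypothetical names used only to show the Initialize / oracle / Finalize flow; they are not games from this paper.

```python
import secrets


class BitGuessGame:
    """A trivial game: the adversary must guess a hidden bit."""

    def initialize(self):
        self.b = secrets.randbits(1)
        return "public-input"          # whatever Initialize hands to the adversary

    def leak(self):
        """An (intentionally insecure) oracle procedure that reveals the bit."""
        return self.b

    def finalize(self, guess):
        return guess == self.b         # the output of the game


def run_game(game, adversary):
    """Initialize runs first, then the adversary with oracle access, then Finalize."""
    adversary_input = game.initialize()
    adversary_output = adversary(adversary_input, game)
    return game.finalize(adversary_output)


def leaking_adversary(adversary_input, game):
    return game.leak()                 # one oracle query, then output the answer


outcome = run_game(BitGuessGame(), leaking_adversary)
```

Here `outcome` corresponds to the event that the game output is true for one execution; advantages are defined from the probability of that event.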

Public-key encryption. A public-key encryption (PKE) scheme is a tuple
of PT algorithms $\mathcal{AE} = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ with associated message length parameter
$n(\cdot)$ and randomness length parameter $\rho(\cdot)$. The parameter generation algorithm
$\mathcal{P}$ takes as input $1^k$ and outputs a parameter string $par$. The key generation
algorithm $\mathcal{K}$ takes input $par$ and outputs a key pair $(pk, sk)$. The encryption
algorithm $\mathcal{E}$ takes inputs $pk$, message $m \in \{0,1\}^{n(k)}$ and coins $r \in \{0,1\}^{\rho(k)}$
and returns the ciphertext denoted $\mathcal{E}(pk, m\,; r)$. The deterministic decryption
algorithm $\mathcal{D}$ takes input $sk$ and ciphertext $c$ and outputs either $\bot$ or a message
in $\{0,1\}^{n(k)}$. For vectors $\mathbf{m}, \mathbf{r}$ with $|\mathbf{m}| = |\mathbf{r}| = v$ we denote by $\mathcal{E}(pk, \mathbf{m}\,; \mathbf{r})$ the
vector $(\mathcal{E}(pk, \mathbf{m}[1]\,; \mathbf{r}[1]), \ldots, \mathcal{E}(pk, \mathbf{m}[v]\,; \mathbf{r}[v]))$. We say that $\mathcal{AE}$ is deterministic
if $\mathcal{E}$ is deterministic. (That is, $\rho(\cdot) = 0$.)

We consider the standard IND-CPA notion of security, captured by the game
$\mathrm{IND}_{\mathcal{AE}}$ where $\mathcal{AE} = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ is an encryption scheme. In the game, Initialize
chooses a random bit $b$, generates parameters $par \xleftarrow{\$} \mathcal{P}(1^k)$ and generates a key
pair $(pk, sk) \xleftarrow{\$} \mathcal{K}(par)$ before returning $pk$ to the adversary. Procedure LR, on
input messages $m_0$ and $m_1$, returns $c \xleftarrow{\$} \mathcal{E}(pk, m_b)$. Lastly, procedure Finalize
takes as input a guess bit $b'$ and outputs true if $b = b'$ and false otherwise. An
IND-CPA adversary makes a single query $(m_0, m_1)$ to LR with $|m_0| = |m_1|$.
For IND-CPA adversary $A$ we let
$\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{AE},A}(k) = 2 \cdot \Pr[\mathrm{IND}^{A}_{\mathcal{AE},k} \Rightarrow \mathsf{true}] - 1$.
We say $\mathcal{AE}$ is IND-CPA secure if $\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{AE},A}(\cdot)$ is negligible for all PT IND-CPA
adversaries $A$.
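
As an illustration of the IND-CPA game and of why encryption must be randomized, the sketch below runs the game against a deterministic toy scheme (textbook RSA with a tiny modulus, chosen here purely as an assumption). The adversary re-encrypts $m_0$ under the public key and compares. Parameter generation is omitted and all names are hypothetical.

```python
import secrets

N, E = 3233, 17        # toy textbook-RSA public key (61 * 53), illustration only


def encrypt(pk, m):
    """Deterministic encryption: no coins are used."""
    n, e = pk
    return pow(m, e, n)


class IndCpaGame:
    def initialize(self):
        self.b = secrets.randbits(1)
        self.pk = (N, E)
        return self.pk                       # pk is handed to the adversary

    def lr(self, m0, m1):
        return encrypt(self.pk, (m0, m1)[self.b])

    def finalize(self, guess):
        return guess == self.b


def adversary(pk, game):
    m0, m1 = 65, 66
    c = game.lr(m0, m1)                      # the single LR query
    return 0 if encrypt(pk, m0) == c else 1  # trial encryption of m0


def estimate_advantage(trials=200):
    wins = 0
    for _ in range(trials):
        game = IndCpaGame()
        pk = game.initialize()
        wins += game.finalize(adversary(pk, game))
    return 2 * wins / trials - 1             # empirical 2 * Pr[win] - 1


advantage = estimate_advantage()
```

Because the toy scheme is a permutation with no coins, the adversary wins every run, so the empirical advantage is exactly 1.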

Sources. We generalize the notion of a source to consider a joint distribution on
the messages and the randomness with which they will be encrypted. A $t$-source
($t \ge 1$) with message length $n(\cdot)$ and randomness length $\rho(\cdot)$ is a probabilistic
algorithm $\mathcal{M}$ that on input $1^k$ returns a $(t+1)$-tuple $(\mathbf{m}_0, \ldots, \mathbf{m}_{t-1}, \mathbf{r})$ of
equal-length vectors, where $\mathbf{m}_0, \ldots, \mathbf{m}_{t-1}$ are over $\{0,1\}^{n(k)}$ and $\mathbf{r}$ is over $\{0,1\}^{\rho(k)}$.

We say that $\mathcal{M}$ has min-entropy $\mu(\cdot)$ if

$\Pr[(\mathbf{m}_b[i], \mathbf{r}[i]) = (m, r)] \le 2^{-\mu(k)}$

for all $k \in \mathbb{N}$, all $b \in \{0, \ldots, t-1\}$, all $i$ and all $(m, r) \in \{0,1\}^{n(k)} \times \{0,1\}^{\rho(k)}$.
We say it has conditional min-entropy $\mu(\cdot)$ if

$\Pr[(\mathbf{m}_b[i], \mathbf{r}[i]) = (m, r) \mid \forall j < i\ (\mathbf{m}_b[j], \mathbf{r}[j]) = (\mathbf{m}'[j], \mathbf{r}'[j])] \le 2^{-\mu(k)}$

for all $k \in \mathbb{N}$, all $b \in \{0, \ldots, t-1\}$, all $i$, all $(m, r)$, and all vectors $\mathbf{m}', \mathbf{r}'$. A
$t$-source with message length $n(\cdot)$, randomness length $\rho(\cdot)$, and min-entropy $\mu(\cdot)$
is referred to as a $(\mu, n, \rho)$-mr-source when $t = 1$ and $\rho(\cdot) > 0$; a $(\mu, n)$-m-source
when $t = 1$ and $\rho(\cdot) = 0$; a $(\mu, n, \rho)$-mmr-source when $t = 2$ and $\rho(\cdot) > 0$;
and a $(\mu, n)$-mm-source when $t = 2$ and $\rho(\cdot) = 0$. Each “m” indicates the source
outputting one message vector and an “r” indicates a randomness vector. When
the source has conditional min-entropy $\mu(\cdot)$ we write block-source instead of
source for each of the above. A $v(\cdot)$-vector source outputs vectors of size $v(k)$ for
all $k$.
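
The min-entropy condition is on the joint distribution of a message and its coins, so neither component needs to be unpredictable on its own. The sketch below computes min-entropy for an explicitly given finite distribution. The dictionary representation and the helper names are assumptions for illustration.

```python
import math
from collections import defaultdict


def min_entropy(dist):
    """Min-entropy in bits of a finite distribution given as {outcome: probability}."""
    return -math.log2(max(dist.values()))


def marginal(dist, index):
    """Marginal distribution of one coordinate of a joint (message, coins) distribution."""
    out = defaultdict(float)
    for outcome, prob in dist.items():
        out[outcome[index]] += prob
    return dict(out)


# One-bit message and one-bit coins, independent and uniform (toy example).
joint = {(m, r): 0.25 for m in ("m0", "m1") for r in ("r0", "r1")}

joint_bits = min_entropy(joint)                  # entropy of the pair
message_bits = min_entropy(marginal(joint, 0))   # entropy of the message alone
coins_bits = min_entropy(marginal(joint, 1))     # entropy of the coins alone
```

In this toy distribution the pair carries 2 bits while each component alone carries 1 bit, which is the effect the joint formulation is designed to capture.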

Universal hash functions. A family of functions is a tuple $\mathcal{H} = (\mathcal{P}, \mathcal{K}, F)$
with associated message length $n(\cdot)$. It is required that the domain of $F(K, \cdot)$
is $\{0,1\}^{n(k)}$ for every $k$, every $par \in [\mathcal{P}(1^k)]$, and every $K \in [\mathcal{K}(par)]$. We say
that $\mathcal{H}$ is universal if for every $k$, all $par \in [\mathcal{P}(1^k)]$, and all distinct $x_1, x_2 \in
\{0,1\}^{n(k)}$, the probability that $F(K, x_1) = F(K, x_2)$ is at most $1/|R(par)|$ where
$R(par) = \{ F(K, x) : K \in [\mathcal{K}(par)] \text{ and } x \in \{0,1\}^{n(k)} \}$ and the probability is over
$K \xleftarrow{\$} \mathcal{K}(par)$.
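
The universality condition can be checked exhaustively for small families. The sketch below does so for the affine family $x \mapsto ax + b \bmod p$ over a toy prime field. This domain differs from the bit-string domain of the definition and is used only as an assumption to keep the example small; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def is_universal(keys, domain, F):
    """Check Pr_K[F(K, x1) = F(K, x2)] <= 1/|R| for all distinct x1, x2."""
    image = {F(K, x) for K in keys for x in domain}
    bound = Fraction(1, len(image))
    for x1, x2 in combinations(domain, 2):
        collisions = sum(1 for K in keys if F(K, x1) == F(K, x2))
        if Fraction(collisions, len(keys)) > bound:
            return False
    return True


p = 7                                            # toy prime (assumption)
domain = list(range(p))


def affine(K, x):
    a, b = K
    return (a * x + b) % p


all_keys = list(product(range(p), repeat=2))     # every (a, b) pair
constant_keys = [(0, b) for b in range(p)]       # a = 0: constant functions

affine_ok = is_universal(all_keys, domain, affine)
constant_ok = is_universal(constant_keys, domain, affine)
```

For the full affine family two distinct inputs collide only when $a = 0$, which happens with probability $1/p$, so the bound holds with equality. The constant family collides always and fails the check.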

Lossy Trapdoor Functions (LTDFs). To a deterministic PKE scheme (recall
that a family of injective trapdoor functions and a deterministic encryption
scheme are, syntactically, the same object) $\mathcal{AE}_d = (\mathcal{P}_d, \mathcal{K}_d, \mathcal{E}_d, \mathcal{D}_d)$ with message
length $n_d(\cdot)$ we can associate an $(n_d, \ell)$-lossy key generator $\mathcal{K}_l$. This is a PT
algorithm that, on input $par$, outputs a value $pk$ for which the map $\mathcal{E}_d(pk, \cdot)$
has image size at most $2^{n_d(k) - \ell(k)}$. The parameter $\ell$ is called the lossiness of the
lossy key generator. We associate to $\mathcal{AE}_d$, lossy key generator $\mathcal{K}_l$ and a LOS
adversary $A$ the function
$\mathbf{Adv}^{\mathrm{los}}_{\mathcal{AE}_d,\mathcal{K}_l,A}(k) = 2 \cdot \Pr[\mathrm{LOS}^{A}_{\mathcal{AE}_d,\mathcal{K}_l,k} \Rightarrow \mathsf{true}] - 1$, where
game $\mathrm{LOS}_{\mathcal{AE}_d,\mathcal{K}_l,k}$ works as follows. Initialize chooses a random bit $b$ and generates
parameters $par \xleftarrow{\$} \mathcal{P}_d(1^k)$; if $b = 0$ it runs $(pk, sk) \xleftarrow{\$} \mathcal{K}_d(par)$ and if $b = 1$ it
runs $pk \xleftarrow{\$} \mathcal{K}_l(par)$. It then returns $pk$ (to the adversary $A$). When $A$ finishes,
outputting guess $b'$, Finalize returns true if $b = b'$. We say $\mathcal{K}_l$ is universal-inducing
if $\mathcal{H} = (\mathcal{P}_d, \mathcal{K}_l, \mathcal{E}_d)$ is a family of universal hash functions with message
length $n_d$.

A deterministic encryption scheme $\mathcal{AE}_d$ is an $(n_d, \ell)$-lossy trapdoor function
(LTDF) if there exists an $(n_d, \ell)$-lossy key generator $\mathcal{K}_l$ such that $\mathbf{Adv}^{\mathrm{los}}_{\mathcal{AE}_d,\mathcal{K}_l,A}(\cdot)$ is
negligible for all PT $A$. We say it is a universal $(n_d, \ell)$-lossy trapdoor function
(u-LTDF) if in addition $\mathcal{K}_l$ is universal-inducing.

Lossy trapdoor functions were introduced by Peikert and Waters EH, and can 
be based on a variety of number-theoretic assumptions, including the hardness of 
the decisional Diffie-Hellman problem, the worst-case hardness of lattice prob- 
lems, and the hardness of Paillier’s composite residuosity problem |31I12I34| . 
Boldyreva et al. P2j observed that the DDH-based construction is universal. 
proc. Initialize($1^k$):
    $par \xleftarrow{\$} \mathcal{P}(1^k)$
    $(pk, sk) \xleftarrow{\$} \mathcal{K}(par)$
    $b \xleftarrow{\$} \{0, 1\}$
    Ret $par$

proc. LR($\mathcal{M}$):
    If pkout = true then Ret $\bot$
    $(\mathbf{m}_0, \mathbf{m}_1, \mathbf{r}) \xleftarrow{\$} \mathcal{M}(1^k)$
    Ret $\mathcal{E}(pk, \mathbf{m}_b; \mathbf{r})$

proc. RevealPK():
    pkout $\leftarrow$ true
    Ret $pk$

proc. Finalize($b'$):
    Ret $(b = b')$

Fig. 2. Game $\mathrm{CDA}_{\mathcal{AE},k}$

3 Security against Chosen Distribution Attack 

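Let $\mathcal{AE} = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme. A CDA adversary is one whose
LR queries are all mmr-sources. Game $\mathrm{CDA}_{\mathcal{AE},k}$ of Figure 2 provides the adversary
with two oracles. The advantage of CDA adversary $A$ is

$\mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},A}(k) = 2 \cdot \Pr[\mathrm{CDA}^{A}_{\mathcal{AE},k} \Rightarrow \mathsf{true}] - 1$.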

In the random oracle model we allow all algorithms in Game CDA to access the 
random oracle; importantly, this includes the mmr-sources. 

Discussion. Adversary $A$ can query LR with an mmr-source of its choice, an
output $(\mathbf{m}_0, \mathbf{m}_1, \mathbf{r})$ of which represents choices of message vectors to encrypt and
randomness with which to encrypt them. (An alternative formulation might have
CDA adversaries query two mr-sources, and distinguish between the encryption
of samples taken from one of these. But this would mandate that schemes ensure
privacy of messages and randomness.) This allows $A$ to dictate a joint distribution
on the messages and randomness. In this way it conservatively models
even adversarially-subverted random number generators. Multiple LR queries
are allowed. In the most general case these queries may be adaptive, meaning
that they may depend on answers to previous queries.

Given that multiple LR queries are allowed, one may ask why an mmr-source 
needs to produce message and randomness vectors rather than simply a single 
pair of messages and a single choice of randomness. The reason is that the 
coordinates in a vector all depend on the same coins underlying an execution of 
$\mathcal{M}$, but the coins underlying the execution of the sources in different queries are
independent. 

Note that Initialize does not return the public key pk to A. A can get it 
at any time by calling RevealPK, but once it does this, LR will return $\bot$.
The reason is that we inherit from deterministic encryption the unavoidable 
limitation that encryption cannot hide public-key related information about the 
plaintexts 0- (When the randomness has low entropy, the ciphertext itself is 
such information.) 
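
The oracle interface of game CDA can be sketched in code. The following is a minimal illustration, assuming a hypothetical scheme object with `params`, `keygen`, and `enc` methods and assuming that encryption of a vector is applied componentwise; `ToyScheme` is a stand-in with no security and exists only so the game can run.

```python
import hashlib
import secrets


class ToyScheme:
    """Hypothetical placeholder scheme (NOT secure); it only fixes an interface."""

    def params(self, k):
        return k

    def keygen(self, par):
        sk = secrets.token_bytes(16)
        pk = hashlib.sha256(sk).digest()
        return pk, sk

    def enc(self, pk, m, r):
        # Deterministic in (pk, m, r): the coins r are an explicit input.
        return hashlib.sha256(pk + r + m).digest()


class CDAGame:
    """Sketch of the CDA game: Initialize, LR, RevealPK, Finalize."""

    def __init__(self, scheme, k=128):
        self.scheme = scheme
        self.k = k
        self.pkout = False

    def initialize(self):
        self.par = self.scheme.params(self.k)
        self.pk, self.sk = self.scheme.keygen(self.par)
        self.b = secrets.randbits(1)
        return self.par  # the public key is NOT returned here

    def lr(self, source):
        if self.pkout:
            return None  # None stands for the bottom symbol
        m0, m1, r = source(self.k)
        chosen = m1 if self.b else m0
        return [self.scheme.enc(self.pk, m, ri) for m, ri in zip(chosen, r)]

    def reveal_pk(self):
        self.pkout = True
        return self.pk

    def finalize(self, guess):
        return guess == self.b
```

A source is modelled here as a callable taking the security parameter and returning a triple of equal-length vectors; once `reveal_pk` has been called, every later `lr` call returns `None`.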

As we saw in the previous section, no encryption scheme is secure when 
both messages and randomness are predictable. Formally, this means chosen- 
distribution attacks are trivial when adversaries can query mmr-sources of low 
min-entropy. Our notions (below) will therefore require security only for sources 
that have high min-entropy or high conditional min-entropy. 
Equality patterns. Suppose $A$ makes a query $\mathcal{M}$ which returns $(\mathbf{m}_0, \mathbf{m}_1, \mathbf{r}) =
((a, a), (a, a'), (r, r))$ for some $a \neq a'$ and random $r$. Then it can win trivially because
the (two) components of the returned vector $\mathbf{c}$ are equal if $b = 0$ and
unequal otherwise. This limitation, again inherited from deterministic encryption
0, is inherent. To capture it we associate to an mmr-source $\mathcal{M}$ an equality-pattern
probability

$\zeta(k) = \Pr[\, \mathrm{eq}((\mathbf{m}_0, \mathbf{r}), (\mathbf{m}_1, \mathbf{r})) = 0 \;:\; (\mathbf{m}_0, \mathbf{m}_1, \mathbf{r}) \xleftarrow{\$} \mathcal{M}(1^k) \,]$

where $\mathrm{eq}((\mathbf{x}_1, \mathbf{x}_2), (\mathbf{y}_1, \mathbf{y}_2))$ is 1 if for all $i, j$

$(\mathbf{x}_1[i], \mathbf{x}_2[i]) = (\mathbf{x}_1[j], \mathbf{x}_2[j])$ iff $(\mathbf{y}_1[i], \mathbf{y}_2[i]) = (\mathbf{y}_1[j], \mathbf{y}_2[j])$,

and 0 otherwise. We point out that LR queries that are mmr-block-sources
(and not just mmr-sources) with high conditional min-entropy have negligible
equality-pattern probability.
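
The predicate eq can be written out directly. The sketch below is an illustrative transcription, assuming vectors are Python sequences of equal length; the sample values are made up for the example.

```python
def eq(x, y):
    """Return 1 if the two vector pairs have the same equality pattern, else 0.

    x = (x1, x2) and y = (y1, y2); positions i and j must coincide in x
    exactly when they coincide in y.
    """
    x1, x2 = x
    y1, y2 = y
    n = len(x1)
    for i in range(n):
        for j in range(n):
            same_x = (x1[i], x2[i]) == (x1[j], x2[j])
            same_y = (y1[i], y2[i]) == (y1[j], y2[j])
            if same_x != same_y:
                return 0
    return 1


# The trivial attack from the text: m0 = (a, a), m1 = (a, a'), r = (r, r).
m0, m1, r = ("a", "a"), ("a", "a2"), ("r", "r")
attack_pattern = eq((m0, r), (m1, r))

# A query whose components are pairwise distinct on both sides.
benign_pattern = eq((("a", "b"), ("r", "s")), (("c", "d"), ("r", "s")))
```

The attack query yields a pattern mismatch, which is exactly the event whose probability $\zeta(k)$ measures.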

Notions. We can assume (without loss of generality) that a CDA adversary
makes a single RevealPK query and then no further LR queries. We say $A$ is
a $(\mu, n, \rho)$-adversary if all of its LR queries are $(\mu, n, \rho)$-mmr-sources. We say
that a PKE scheme $\mathcal{AE}$ with message length $n(\cdot)$ and randomness length $\rho(\cdot)$ is
IND-CDA secure for $(\mu, n, \rho)$-mmr-sources if for all PT $(\mu, n, \rho)$-adversaries $A$ the
function $\mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},A}(\cdot)$ is negligible. Scheme $\mathcal{AE}$ is H-IND secure for $(\mu, n, \rho)$-mmr-sources
if it is IND-CPA secure and IND-CDA secure for $(\mu, n, \rho)$-mmr-sources.
We can extend these notions to mmr-block-sources by restricting to adversaries
that query mmr-block-sources.

On adaptivity. We can consider non-adaptive IND-CDA security by restrict- 
ing attention in the notions above to adversaries that only make a single LR 
query. Why do we not focus solely on this (simpler) security goal? The standard 
IND-CPA setting (implicitly) provides security against multiple, adaptive LR 
queries. This is true because in that setting a straightforward hybrid argument 
shows that security against multiple adaptive LR queries is implied by security 
against a single LR query. We wish to maintain the same standard of adaptive
security in the IND-CDA setting. Unfortunately, in the IND-CDA setting,
unlike the IND-CPA setting, adaptive security is not implied by non-adaptive 
security. In short this is because a CDA adversary necessarily cannot learn the 
public key before (or while) making LR queries. To see the separation, consider 
a PKE scheme that appends to every ciphertext the public key used. This will 
not affect the security of the scheme when an adversary can only make a single 
query. However, an adaptive CDA adversary can query an mmr-source, learn 
the public key, and craft a second source that uses the public key to ensure 
ciphertexts which leak the challenge bit. 

Given this, our primary goal is the stronger notion of adaptive security. That 
said, non-adaptive hedge security is also relevant because in practice adap- 
tive adversaries might be rare and (as we will see in Section EJl one can find 
non-adaptively-secure schemes that are more efficient and/or have proofs under 
weaker assumptions. 




Adaptive PRIV. A special case of our framework occurs when the PKE scheme
$\mathcal{AE}$ being considered has randomness length $\rho(k) = 0$ for all $k$ (meaning also that
adversaries query mm-sources, instead of mmr-sources). In this case we are considering
deterministic encryption, and the IND-CDA definition and notions give
a strengthening (by way of adaptivity) of the PRIV security notion from jbl8H2j . 
(For non-adaptive adversaries the definitions are equivalent.) For clarity we will
use PRIV to refer to this special case, and let $\mathbf{Adv}^{\mathrm{priv}}_{\mathcal{AE},A}(k) = \mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},A}(k)$.

Resource usage. Recall that by our convention, the running time of a CDA
adversary is the time for the execution of the adversary with game $\mathrm{CDA}_{\mathcal{AE},k}$.
Thus, A being PT implies that the mmr-sources that comprise A’s LR queries 
are also PT. This is a distinction from which will be important in our results. 
Note that in practice we do not expect to see sources that are not PT, so our 
definition is not restrictive. Non-PT sources were needed in H2| for showing 
that single-message security implied (non-adaptive) multi-message security for 
deterministic encryption of block sources. 

4 Constructions 

Here we present several constructions for hedged encryption. The first scheme 
uses a random oracle and an IND-CPA secure probabilistic encryption scheme. 
The next two schemes derive from composing a randomized encryption scheme 
with a deterministic one (there are two ways of ordering composition). Interestingly,
only one ordering will end up providing security. The final scheme converts
a deterministic encryption scheme to a hedged one by padding the message
with random bits. For the following, let $\mathcal{AE}_r = (\mathcal{P}_r, \mathcal{K}_r, \mathcal{E}_r, \mathcal{D}_r)$ be a (randomized)
PKE scheme with message length $n_r(\cdot)$ and randomness length $\rho(\cdot)$. Let
$\mathcal{AE}_d = (\mathcal{P}_d, \mathcal{K}_d, \mathcal{E}_d, \mathcal{D}_d)$ be a (deterministic) PKE scheme with message length
$n_d(\cdot)$ and randomness length always 0. Associate to $\mathcal{AE}_c$ for $c \in \{d, r\}$ the function
$\mathrm{maxclen}_c(k)$ mapping any $k$ to the maximum length (over all possible public
keys, messages, and if applicable, randomness) of a ciphertext output by $\mathcal{E}_c$.

Randomized-encrypt-with-hash. Let $\mathcal{R} : \{0,1\}^* \to \{0,1\}^*$ be a random
oracle. Let $\mathrm{REwH}[\mathcal{AE}_r] = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be the scheme parameterized by randomizer
length $\kappa$ that works as follows. Parameter generation and decryption are
the same as in $\mathcal{AE}_r$. Key generation runs $\mathcal{K}_r(par_r)$ to get $(pk_r, sk_r)$, chooses
$K \xleftarrow{\$} \{0,1\}^{\kappa(k)}$, and lets $pk = (pk_r \,\|\, K)$ and $sk = sk_r$. Algorithm $\mathcal{E}^{\mathcal{R}}$, on
input $(pk, m)$ where $pk = (pk_r \,\|\, K)$, chooses $r \xleftarrow{\$} \{0,1\}^{\rho(k)}$ and computes
$r' \leftarrow \mathcal{R}(pk_r \,\|\, K \,\|\, r \,\|\, m)$ (where here we take $\mathcal{R}$'s output to be of length $\rho(k)$) and
outputs $\mathcal{E}_r(pk_r, m; r')$. Intuitively, the random oracle provides perfect and (as
long as $m$ and $r$ are hard to predict) private randomness. When the key length
$\kappa(k) = 0$ for all $k$, we refer to the scheme as REwH1, while when $\kappa(k) > 0$ for all
$k$ we refer to the scheme as REwH2. The scheme extends the Encrypt-with-Hash
deterministic encryption scheme from |JjJ, which is a special case of REwH1 when
r has length 0, and is also reminiscent of constructions in the symmetric setting 
that utilize a PRF to ensure good randomness |27IT1| . as well as schemes using 
the Fujisaki-Okamoto transform EDJ. 
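
The coin-derivation step of REwH can be sketched as follows. This is a minimal illustration under stated assumptions: SHAKE-256 stands in for the random oracle, inputs are length-prefixed so the concatenation is unambiguous, and `enc_r` is a hypothetical callable `(pk_r, m, coins) -> ciphertext` for the underlying randomized scheme with explicit coins. None of these choices come from the paper.

```python
import hashlib
import secrets


def _encode(*parts):
    """Length-prefix each byte string so the concatenation parses uniquely."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def derive_coins(pk_r, K, r, m, rho_bytes):
    """r' <- R(pk_r || K || r || m), truncated to rho_bytes bytes."""
    return hashlib.shake_256(_encode(pk_r, K, r, m)).digest(rho_bytes)


def rewh_encrypt(enc_r, pk, m, rho_bytes, r=None):
    """Encrypt m under pk = (pk_r, K) with hashed coins.

    If r is None, fresh coins are sampled; passing r explicitly models a
    caller (or a faulty RNG) that fixes the randomness.
    """
    pk_r, K = pk
    if r is None:
        r = secrets.token_bytes(rho_bytes)
    coins = derive_coins(pk_r, K, r, m, rho_bytes)
    return enc_r(pk_r, m, coins)
```

Setting `K = b""` corresponds to the REwH1 variant, and a non-empty `K` to REwH2. Even when `r` repeats, the derived coins change whenever the message changes, which is the intuition behind the hedge.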
Deterministic-then-randomized. Our first standard model attempt is to
perform hedged encryption via first applying deterministic encryption and then
randomized. More formally let $\mathrm{DtR}[\mathcal{AE}_r, \mathcal{AE}_d] = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be the scheme that
works as follows. The parameter generation algorithm $\mathcal{P}$ runs $par_r \xleftarrow{\$} \mathcal{P}_r(1^k)$
and $par_d \xleftarrow{\$} \mathcal{P}_d(1^k)$ and outputs $par = (par_r, par_d)$. Key generation $\mathcal{K}$ just runs
$(pk_r, sk_r) \xleftarrow{\$} \mathcal{K}_r(par_r)$ and $(pk_d, sk_d) \xleftarrow{\$} \mathcal{K}_d(par_d)$ and outputs $pk = (pk_r, pk_d)$
and $sk = (sk_r, sk_d)$. We define encryption by

$\mathcal{E}((pk_r, pk_d), m; r) = \mathcal{E}_r(pk_r, c \,\|\, 10^{\ell}; r)$,

where $c = \mathcal{E}_d(pk_d, m)$ and $\ell = n_r - |c| - 1$. Here we need that $n_r(k) > \mathrm{maxclen}_d(k)$
for all $k$. Decryption is defined in the natural way. The scheme will clearly inherit
IND-CPA security from the application of $\mathcal{E}_r$. If the deterministic encryption
scheme is PRIV secure for min-entropy $\mu$, then the composition will also be
secure if the message has min-entropy at least $\mu$. However, our strong notion of
IND-CDA security requires that schemes be secure if the joint distribution on the
message and randomness has high min-entropy. If the entropy is unfortuitously
split between both the randomness and the message, then there is no guarantee
that the composition will be secure. In fact, many choices for instantiating $\mathcal{AE}_r$
and $\mathcal{AE}_d$ lead to a composition for which attacks can be exhibited (even when
the schemes are, separately, secure).

Randomized-then-deterministic. We can instead apply randomized encryption
first, and then apply deterministic encryption. Define $\mathrm{RtD}[\mathcal{AE}_r, \mathcal{AE}_d] =
(\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ to work as follows. The parameter and key generation algorithms
are as for scheme DtR. Encryption is defined by

$\mathcal{E}((pk_r, pk_d), m; r) = \mathcal{E}_d(pk_d, c \,\|\, 10^{\ell})$,

where $c = \mathcal{E}_r(pk_r, m; r)$ and $\ell = n_d - |c| - 1$. Here we need that $n_d(k) >
\mathrm{maxclen}_r(k)$ for all $k$. The decryption algorithm $\mathcal{D}$ works in the natural way. As
we will see, this construction avoids the security issues of the previous one, as long
as the randomized encryption scheme preserves the min-entropy of its inputs.
(For example, if for all $k$, all $par_r \in [\mathcal{P}_r(1^k)]$, and all $(pk_r, sk_r) \in [\mathcal{K}_r(par_r)]$,
$\mathcal{E}_r(pk_r, \cdot)$ is injective in $(m, r)$.) Many encryption schemes have this property; El
Gamal EB is one example.
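
Both compositions pad the inner ciphertext $c$ to the outer message length with a single 1 bit followed by zeros. A minimal sketch of this padding and its inverse, assuming bit strings are represented as Python strings of '0' and '1' characters (a representation chosen only for readability):

```python
def pad_10(c: str, n: int) -> str:
    """Return c || 1 || 0^ell with ell = n - |c| - 1, so the result has length n."""
    ell = n - len(c) - 1
    if ell < 0:
        raise ValueError("inner ciphertext too long for outer message length")
    return c + "1" + "0" * ell


def unpad_10(padded: str) -> str:
    """Strip the trailing 1 0...0 marker and return the original bit string."""
    return padded[: padded.rindex("1")]


padded = pad_10("0110", 8)
recovered = unpad_10(padded)
```

The requirement that the outer message length exceed the maximum inner ciphertext length is what guarantees $\ell \geq 0$, and the lone 1 bit makes the padding uniquely removable even when $c$ itself ends in zeros.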

Pad-then-Deterministic. Our final construction dispenses entirely with the
need for a dedicated randomized encryption scheme, instead using simple padding
to directly construct a (randomized) encryption scheme from a deterministic one.
Let $\mathrm{PtD}[\mathcal{AE}_d] = (\mathcal{P}_d, \mathcal{K}_d, \mathcal{E}, \mathcal{D})$ work as follows. Parameter and key generation are
inherited from the underlying (deterministic) encryption scheme. Encryption is
defined by

$\mathcal{E}(pk_d, m; r) = \mathcal{E}_d(pk_d, r \,\|\, m)$.

Decryption proceeds by applying $\mathcal{D}_d$, to retrieve $r \,\|\, m$, and then returning $m$.
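
The wiring of PtD is simple enough to sketch directly. In the illustration below, `enc_d` and `dec_d` are hypothetical callables for the underlying deterministic scheme, and byte reversal is used as a toy stand-in that is obviously not secure; only the prepend-and-strip structure reflects the construction.

```python
import secrets


def ptd_encrypt(enc_d, pk_d, m: bytes, rho_bytes: int, r=None):
    """Compute E_d(pk_d, r || m) with rho_bytes bytes of padding randomness."""
    if r is None:
        r = secrets.token_bytes(rho_bytes)
    if len(r) != rho_bytes:
        raise ValueError("randomness has the wrong length")
    return enc_d(pk_d, r + m)


def ptd_decrypt(dec_d, sk_d, c, rho_bytes: int) -> bytes:
    """Recover r || m with D_d, then drop the first rho_bytes bytes."""
    return dec_d(sk_d, c)[rho_bytes:]


# Toy invertible map standing in for a deterministic scheme (NOT secure).
toy_enc = lambda pk, x: x[::-1]
toy_dec = lambda sk, y: y[::-1]

ciphertext = ptd_encrypt(toy_enc, None, b"hello", 8)
plaintext = ptd_decrypt(toy_dec, None, ciphertext, 8)
```

Because the randomness simply becomes part of the deterministic scheme's input, the message length available to the caller shrinks by the randomness length.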
5 Non-adaptive Hedge Security

In this section we investigate the non-adaptive hedge security of REwH, RtD and 
PtD, leaving adaptive security to future sections. 

Randomized-encrypt-with-hash. Intuitively, the security of $\mathrm{REwH}[\mathcal{AE}_r]$ follows
from the IND-CPA security of $\mathcal{AE}_r$ and the random oracle providing "perfect"
randomness. Following jOj, for any $k$ let $\mathrm{maxpk}_{\mathcal{AE}}(k)$ be the maximum of
$\Pr[\, pk = w : (pk, sk) \xleftarrow{\$} \mathcal{K}(par) \,]$, where the maximum is taken over all $w \in
\{0,1\}^*$ and all $par \in [\mathcal{P}(1^k)]$.

Theorem 1. [REwH is non-adaptive H-IND secure]. Let $\mathcal{AE}_r = (\mathcal{P}_r, \mathcal{K}_r,
\mathcal{E}_r, \mathcal{D}_r)$ be a PKE scheme with message length $n(\cdot)$ and randomness length $\rho$ and
let $\mathcal{AE} = \mathrm{REwH}[\mathcal{AE}_r] = (\mathcal{P}_r, \mathcal{K}_r, \mathcal{E}, \mathcal{D}_r)$ be the PKE scheme constructed from it.

• (IND-CPA) Let $A$ be an IND-CPA adversary. Then there exists an IND-CPA
adversary $B$ such that for all $k$

$\mathbf{Adv}^{\mathrm{cpa}}_{\mathcal{AE},A}(k) = \mathbf{Adv}^{\mathrm{cpa}}_{\mathcal{AE}_r,B}(k)$

where $B$ runs in time that of $A$ and makes the same number of queries.

• (IND-CDA) Let $A$ be an adversary that makes a single LR query consisting
of a $v(\cdot)$-vector $(\mu, n, \rho)$-mmr-source with equality-pattern probability $\zeta(\cdot)$
and making at most $h(\cdot)$ random oracle queries. Then there exists an IND-CPA
adversary $B$ such that for all $k$

Adv^ d |; A (ft) < v{k) ^Adv^^fe) + + 8 • maxpk_ A£r (i)') + C (fc) 

Adversary $B$ runs in time that of $A$ and $\mathrm{maxpk}_{\mathcal{AE}_r}$ is the maximum public
key probability of $\mathcal{AE}_r$. □

The first part of the theorem is straightforward to prove. The second follows
from an adaptation of the proof of security for the similar Encrypt-with-Hash
deterministic encryption scheme in 0. Notice that the theorem holds for both
REwH1 and REwH2; the only difference is that with the latter the $\mathrm{maxpk}_{\mathcal{AE}}(k)$
term improves depending on the length $\kappa$.

Randomized-then-deterministic. Intuitively, the non-adaptive hedged security
of the RtD construction is inherited from the IND-CPA security of the
underlying randomized scheme $\mathcal{AE}_r$ and the (non-adaptive) PRIV security of
the underlying deterministic scheme $\mathcal{AE}_d$. As alluded to before, we have one
technical requirement on $\mathcal{AE}_r$ for the IND-CDA proof to work. We say $\mathcal{AE}_r =
(\mathcal{P}_r, \mathcal{K}_r, \mathcal{E}_r, \mathcal{D}_r)$ with message length $n_r(\cdot)$ and randomness length $\rho(\cdot)$ is
min-entropy preserving if for any $k$, any $par_r \in [\mathcal{P}_r(1^k)]$, any $(pk_r, sk_r) \in [\mathcal{K}_r(par_r)]$,
and for all $c \in \{0,1\}^*$ it is the case for any $(\mu, n_r, \rho)$-mr-source $\mathcal{M}$ outputting
vectors of size one that $\Pr[\, c = \mathcal{E}_r(pk_r, m; r) : (m, r) \xleftarrow{\$} \mathcal{M}(1^k) \,] \leq 2^{-\mu}$. In
words, encryption preserves the min-entropy of the input message and randomness.
We have the following theorem.
Theorem 2. [RtD is non-adaptive H-IND secure]. Let $\mathcal{AE}_r = (\mathcal{P}_r, \mathcal{K}_r, \mathcal{E}_r,
\mathcal{D}_r)$ be a min-entropy preserving PKE scheme with message length $n_r(\cdot)$ and
randomness length $\rho(\cdot)$. Let $\mathcal{AE}_d = (\mathcal{P}_d, \mathcal{K}_d, \mathcal{E}_d, \mathcal{D}_d)$ be a (deterministic) encryption
scheme with message length $n_d(\cdot)$ so that $n_d(\cdot) > \mathrm{maxclen}_r(\cdot)$. Let
$\mathcal{AE} = \mathrm{RtD}[\mathcal{AE}_r, \mathcal{AE}_d] = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be the PKE scheme defined in Section 4.

• (IND-CPA) Let $A$ be an IND-CPA adversary. Then there exists an IND-CPA
adversary $B$ such that for any $k$

$\mathbf{Adv}^{\mathrm{cpa}}_{\mathcal{AE},A}(k) = \mathbf{Adv}^{\mathrm{cpa}}_{\mathcal{AE}_r,B}(k)$

where $B$ runs in time that of $A$ plus the time to run $\mathcal{E}_d$ once.

• (IND-CDA) Let $A$ be a CDA adversary that makes one LR query consisting
of a $v(\cdot)$-vector $(\mu, n_r, \rho)$-mmr-source (resp. block-source). Then there exists
a PRIV adversary $B$ such that for any $k$

$\mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},A}(k) \leq \mathbf{Adv}^{\mathrm{priv}}_{\mathcal{AE}_d,B}(k)$

where $B$ runs in time that of $A$ plus the time to run $v(k)$ executions of $\mathcal{E}_r$
and makes one LR query consisting of a $v(\cdot)$-vector $(\mu, \mathrm{maxclen}_r)$-mm-source
(resp. block-source). □

Note that the second part of the theorem states the result for either sources or
just block-sources. We briefly sketch the proof. The first part of the theorem is immediate
from the IND-CPA security of $\mathcal{AE}_r$. For the second part, any mmr-source
$\mathcal{M}$ queried by $A$ is converted into an mm-source $\mathcal{M}'$ to be queried by $B$. This is
done by having $\mathcal{M}'$ run $\mathcal{M}$ to get $(\mathbf{m}_0, \mathbf{m}_1, \mathbf{r})$ and then outputting the pair of vectors
$(\mathcal{E}_r(pk_r, \mathbf{m}_0; \mathbf{r}), \mathcal{E}_r(pk_r, \mathbf{m}_1; \mathbf{r}))$. (The ciphertexts are the "messages" for $\mathcal{E}_d$.)
Because $\mathcal{AE}_r$ is min-entropy preserving, $\mathcal{M}'$ is a source of the appropriate type.
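
The min-entropy preservation requirement can be illustrated numerically. The sketch below is a toy calculation, assuming a uniform distribution on pairs $(m, r)$ with two bits each; the maps `injective` and `lossy` are made-up stand-ins for encryption functions and are not schemes from the paper.

```python
import math
from collections import defaultdict


def min_entropy(dist):
    """Min-entropy in bits: -log2 of the most likely outcome's probability."""
    return -math.log2(max(dist.values()))


def push_forward(dist, f):
    """Distribution of f(x) when x is drawn from dist."""
    out = defaultdict(float)
    for x, p in dist.items():
        out[f(x)] += p
    return dict(out)


# Uniform source over (m, r) with m, r in {0, 1, 2, 3}: 4 bits of min-entropy.
source = {(m, r): 1 / 16 for m in range(4) for r in range(4)}

injective = lambda mr: 4 * mr[0] + mr[1]  # distinct inputs, distinct outputs
lossy = lambda mr: mr[0]                  # ignores the randomness entirely

h_source = min_entropy(source)
h_injective = min_entropy(push_forward(source, injective))
h_lossy = min_entropy(push_forward(source, lossy))
```

An encryption map that is injective in $(m, r)$ leaves the min-entropy unchanged, whereas a map that discards part of its input can reduce it, which is the situation the definition rules out.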

Pad-then-deterministic. The security of the PtD scheme is more difficult to
establish. The IND-CDA security is inherited immediately from the PRIV security
of the $\mathcal{AE}_d$ scheme. Here the challenge is, in fact, proving IND-CPA security.
For this we will need a stronger assumption on the underlying deterministic encryption
scheme: that it is a u-LTDF.

Theorem 3. [PtD is non-adaptive H-IND secure]. Let $\mathcal{AE}_d = (\mathcal{P}_d, \mathcal{K}_d, \mathcal{E}_d,
\mathcal{D}_d)$ be a deterministic encryption scheme with message length $n_d(\cdot)$. Let $\mathcal{AE} =
\mathrm{PtD}[\mathcal{AE}_d] = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be the PKE scheme defined in Section 4 with message
length $n(\cdot)$ and randomness length $\rho(\cdot)$ such that $n(k) = n_d(k) - \rho(k)$ for all $k$.

• (IND-CPA) Let $\mathcal{K}_l$ be a universal-inducing $(n_d, \ell)$-lossy key generation algorithm
for $\mathcal{AE}_d$. Let $A$ be an IND-CPA adversary. Then there exists a LOS
adversary $B$ such that for all $k$

$\mathbf{Adv}^{\mathrm{cpa}}_{\mathcal{AE},A}(k) \leq \mathbf{Adv}^{\mathrm{los}}_{\mathcal{AE}_d,\mathcal{K}_l,B}(k) + \sqrt{2^{3n(k) - \ell(k) + 2}}$.

$B$ runs in time that of $A$.

• (IND-CDA) Let $A$ be a CDA adversary that makes one LR query consisting
of a $v(\cdot)$-vector $(\mu, n, \rho)$-mmr-source (resp. block-source). Then there exists
a PRIV adversary $B$ such that for all $k$

$\mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},A}(k) \leq \mathbf{Adv}^{\mathrm{priv}}_{\mathcal{AE}_d,B}(k)$

where $B$ runs in time that of $A$ and makes one LR query consisting of a
$v(\cdot)$-vector $(\mu, n_d)$-mm-source (resp. block-source). □

One might think that concluding IND-CPA can be based just on PtD being
IND-CDA secure, since the padded randomness provides high min-entropy. However,
this approach does not work because an IND-CPA adversary expects knowledge
of the public key before making any LR queries, while a CDA adversary
only learns the public key after making its LR queries. This issue is discussed
in more detail in jSj. We use a different approach (which may be of independent
interest) to prove this part of Theorem 3; the details are given in the full version
|Zj. Our proof strategy, intuitively, corresponds to using the standard LHL
$2^{n(k)}$ times, once for each possible message the IND-CPA adversary might query.
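
To get a feel for the parameters, the following sketch evaluates the additive term in the IND-CPA bound of Theorem 3, reading that term as $\sqrt{2^{3n(k) - \ell(k) + 2}}$. This reading matches applying the LHL $2^{n(k)}$ times with a factor of two, but it is a reconstruction and should be treated as an assumption; the function names and the sample values are illustrative only.

```python
def log2_lhl_term(n: int, ell: int) -> float:
    """Base-2 logarithm of sqrt(2^(3n - ell + 2))."""
    return (3 * n - ell + 2) / 2


def min_lossiness(n: int, target_bits: int) -> int:
    """Smallest ell for which the term is at most 2^(-target_bits).

    Solves (3n - ell + 2) / 2 <= -target_bits for ell.
    """
    return 3 * n + 2 + 2 * target_bits


# Example: 128-bit messages and a target statistical term of 2^-80.
needed_ell = min_lossiness(128, 80)
achieved = log2_lhl_term(128, needed_ell)
```

Under this reading the lossiness must exceed three times the message length, so the padding randomness has to be considerably longer than the message for the bound to be meaningful.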

6 Anonymity for Chosen Distribution Attacks 

In the previous section we proved non-adaptive security for the RtD and PtD constructions.
But, as established in Section 3, we actually want to meet the stronger
goal of adaptive security. In the adaptive setting, adversaries can make multiple
LR queries, specifying sources that are generated as a function of previously-seen 
ciphertexts. Recall that one reason adaptivity is difficult to achieve is because ci- 
phertexts might leak information about the public key. In turn, knowledge of the 
public key leads to trivial IND-CDA attacks. This suggests a natural relationship 
with key privacy, also called anonymity 0. Anonymity requires (informally) that 
ciphertexts leak no information about the public key used to perform encryp- 
tion. In this section we formalize a notion of anonymity for chosen-distribution 
attacks. In the next section we’ll use this definition as a step towards adaptive 
IND-CDA security. 

Definitions. Let $\mathcal{AE} = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme. Game $\mathrm{ANON}_{\mathcal{AE},k}$
shown in Figure 3 provides the adversary with two oracles. An ANON adversary $A$ is
one whose queries are all mr-sources. The advantage of ANON adversary $A$ is

$\mathbf{Adv}^{\mathrm{anon}}_{\mathcal{AE},A}(k) = 2 \cdot \Pr[\mathrm{ANON}^{A}_{\mathcal{AE},k} \Rightarrow \mathsf{true}] - 1$.

We say that a PKE scheme $\mathcal{AE}$ with message length $n(\cdot)$ and randomness length
$\rho(\cdot)$ is ANON secure for $(\mu, n, \rho)$-mr-sources if for all PT adversaries $A$ that
only query $(\mu, n, \rho)$-mr-sources the function $\mathbf{Adv}^{\mathrm{anon}}_{\mathcal{AE},A}(\cdot)$ is negligible. We can
extend this notion to mr-block-sources in the obvious way. In the special case
that the randomness length of $\mathcal{AE}$ is always zero, the ANON definition formalizes
anonymity for deterministic encryption or, equivalently, trapdoor functions,
generalizing a definition from jlj. 

Discussion. Anonymity for PKE in the sense of key privacy was first formal- 
ized by Bellare et al. j3], but their notion (analogously to traditional semantic 
security) only works in the context of good randomness. The ANON notion, 
akin to IND-CDA, formalizes key privacy in the face of bad randomness. While 
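proc. Initialize($1^k$):
    $par \xleftarrow{\$} \mathcal{P}(1^k)$
    $(pk_0, sk_0) \xleftarrow{\$} \mathcal{K}(par)$
    $(pk_1, sk_1) \xleftarrow{\$} \mathcal{K}(par)$
    $a \xleftarrow{\$} \{0, 1\}$
    Ret $par$

proc. Enc($\mathcal{M}$):
    If pkout = true Ret $\bot$
    $(\mathbf{m}, \mathbf{r}) \xleftarrow{\$} \mathcal{M}(1^k)$
    Ret $\mathcal{E}(pk_0, \mathbf{m}; \mathbf{r})$

proc. LR($\mathcal{M}$):
    $(\mathbf{m}, \mathbf{r}) \xleftarrow{\$} \mathcal{M}(1^k)$
    $\mathbf{c} \leftarrow \mathcal{E}(pk_a, \mathbf{m}; \mathbf{r})$
    pkout $\leftarrow$ true
    Ret $(pk_0, pk_1, \mathbf{c})$

proc. Finalize($a'$):
    Ret $(a = a')$

Fig. 3. Game $\mathrm{ANON}_{\mathcal{AE},k}$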

we will use it mainly as a technical tool to simplify showing that schemes meet 
adaptive IND-CDA, it is also of independent interest as a new security target 
for PKE schemes when key privacy is important. (That is, one might want to 
hedge against bad randomness for anonymity as well as message privacy.) 

7 Adaptive Hedge Security 

The following theorem, whose proof appears in the full version [Zj, shows that 
achieving ANON security and non-adaptive IND-CDA security are sufficient for 
achieving adaptive IND-CDA security. 

Theorem 4. Let $\mathcal{AE} = (\mathcal{P}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme with message
length $n(\cdot)$ and randomness length $\rho(\cdot)$. Let $A$ be an IND-CDA adversary making
$q(\cdot)$ LR queries, each being a $v(\cdot)$-vector $(\mu, n, \rho)$-mmr-source (resp. block-source).
Then there exist IND-CDA adversary $B$ and ANON adversary $C$ such
that for all $k$

$\mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},A}(k) \leq q(k) \cdot \mathbf{Adv}^{\mathrm{cda}}_{\mathcal{AE},B}(k) + 4q(k) \cdot \mathbf{Adv}^{\mathrm{anon}}_{\mathcal{AE},C}(k)$.

$B$ makes one LR query consisting of a $v(\cdot)$-vector $(\mu, n, \rho)$-mmr-source (resp. block-source).
$C$ makes at most $q(k) - 1$ Enc queries and one LR query, all these consisting
of $v(\cdot)$-vector $(\mu, n, \rho)$-mr-sources (resp. block-sources). Both $B$ and $C$ run
in the same time as $A$. □

Given a non-adaptively IND-CDA secure scheme, Theorem 4 reduces the task of
showing it adaptively secure to that of showing it meets the ANON definition. 
Of course, ANON is still an adaptive notion. (Adversaries can formulate their 
LR query to be a source that’s a function of previously seen ciphertexts.) Nev- 
ertheless, it formalizes a sufficient condition for adaptive CDA security of any 
PKE scheme and captures the relationship between adaptivity and anonymity. 
We believe this is an interesting (and novel) application of anonymity. 

We can show that our random oracle scheme REwH is ANON secure when the 
underlying randomized scheme meets the traditional notions of anonymity for 
PKE (I]. We also want to show that the RtD and PtD schemes are ANON secure. 
We first show something more general: that any u-LTDF is anonymous. Then, 
that RtD and PtD are anonymous follows when using deterministic schemes that 
are also u-LTDFs. 
Universal LTDFs are anonymous. Intuitively, u-LTDFs are anonymous because
the lossy mode admits a universal hash, implying that no information
about the public key is leaked by outputs (generated from sources with high conditional
min-entropy). One might expect that formalizing this intuition would
follow from straightforward application of the Leftover Hash Lemma (LHL).
However, our anonymity definitions are adaptive, so one cannot apply the LHL
(or even the generalized LHL m) directly. Rather, we first show that an adaptive
variant of the LHL is implied by the standard LHL via a hybrid argument. See
the full version for details. Here we use it to prove the following theorem; details
appear in the full version.

Theorem 5. Let $\mathcal{AE}_d = (\mathcal{P}_d, \mathcal{K}_d, \mathcal{E}_d, \mathcal{D}_d)$ be a (deterministic) encryption scheme
with message length $n(\cdot)$ and an associated universal-inducing $(n, \ell)$-lossy key generator
$\mathcal{K}_l$. Let $A$ be an ANON adversary making $q(\cdot)$ Enc queries and a single LR
query, all of these being $v(\cdot)$-vector $(\mu, n)$-m-block-sources. Then there exists LOS
adversary $B$ such that for all $k$

Adv A£°"A(fc) < 2 • Adv^s (jfe) + 3 ■ q(k) -v{k)- ^2 n(fc)-/(fc)-#.(fc) . 

B runs in time that of A. □ 

Consider RtD and PtD when instantiated with a deterministic encryption scheme
that is a u-LTDF. We can apply Theorem 5 to conclude ANON security for
both schemes. Combining this with Theorems 2 and 4 yields proof of adaptive
hedge security for RtD. Likewise, combining it with Theorems 3 and 4 yields
proof of adaptive hedge security for PtD. Also Theorems 4 and 5 combine with
U2| Th. 5.1] to give the first adaptively-secure deterministic encryption scheme 
(based on u-LTDFs). 

REwH2 is adaptively secure. As we show above, we can get adaptive security 
from REwH when the underlying IND-CPA randomized scheme is anonymous 
in the sense of @j. We observe that scheme REwH2 is adaptively secure when 
instantiated with any IND-CPA randomized scheme (not just anonymous ones). 
To show this, we give a direct proof in the full version [Zj. Since popular encryp- 
tion schemes such as RSA are not anonymous, we believe scheme REwH2 could 
be relevant in practice. That being said, we still think REwH1 is important since
non-adaptive security is still a strong notion, and the scheme does not require 
any changes to the structure of the public key. 

Extensions. In the full version [Zj we discuss extensions and variants of RtD 
and PtD, where we improve the (adaptive) concrete security and show how to 
securely use LTDFs that are not necessarily universal. 
Abstract. Secure multi-party computation has been considered by the
cryptographic community for a number of years. Until recently it has 
been a purely theoretical area, with few implementations with which to 
test various ideas. This has led to a number of optimisations being pro- 
posed which are quite restricted in their application. In this paper we 
describe an implementation of the two-party case, using Yao’s garbled 
circuits, and present various algorithmic protocol improvements. These 
optimisations are analysed both theoretically and empirically, using ex- 
periments of various adversarial situations. Our experimental data is 
provided for reasonably large circuits, including one which performs an 
AES encryption, a problem which we discuss in the context of various 
possible applications. 


1 Introduction 

That secure multi-party computation can be executed at all is considered one 
of the main results of the theory of cryptography. Starting with Yao’s seminal 
work many authors have looked at various optimisations and extensions to 
the basic concept, for both the two-party and the multi-party settings, see for 
example 0 , 0 , [HI 0 , 0 0 0 ] . Until recently all work on secure multi-party 
computation has been essentially of a theoretical nature, focusing on feasibility 
results. However in the last few years a number of practical implementations 
have appeared SM0I21. 

There are many different protocols for secure multi-party computation. Our 
work focuses on implementation of secure computation and therefore we only 
mention protocols which have been previously implemented. Secure multi-party 
computation essentially comes in two flavours. The first approach is typically 
based upon secret sharing and operates on an arithmetic circuit representation
of the computed function, such as in the BGW (Ben-Or, Goldwasser and Wigder- 
son) or CCD (Chaum, Crepeau and Damgard) protocols 0,0]. This approach is 
usually applied when there is an honest majority among the participants (which 
can only exist if more than two parties participate in the protocol). An alter- 
native approach represents the function as a binary circuit. This approach was 
used in the original two-party garbled circuit construction of Yao LjO , and in 
the GMW (Goldreich, Micali and Wigderson) multi-party protocol jH|. 

The arithmetic circuit method is better at representing addition and multipli- 
cation operations, where parties have additive shares of secret values, but cannot 
be used to compute comparisons unless the shares are converted to shares of the 
binary representation of the values. This approach has been used to great effect 
in the SIMAP project 0 , which has resulted in a “real-life” application of secure 
multi-party computation to the Danish sugar beet industry p . 

The binary circuit approach handles arithmetic operations, especially mul- 
tiplications, less efficiently, but can easily compute binary operations such as 
comparisons. This second approach, which forms the basis of Yao’s construction 
for the two party case, has been implemented by Malkhi et al. in the Fairplay 
system M . That system also provides a method to compile a given functionality 
from a representation in a high-level language into a circuit, which is then in- 
terpreted by a run-time environment that performs the secure evaluation of this 
functionality. FairplayMP, an extension of Fairplay to the case of more than two 
parties using a modified version of the protocol of Beaver et al. 0] has recently 
been released 0]. All these implementations provide security against semi-honest
adversaries only. A major advantage of the binary circuit based systems (Fair- 
play and FairplayMP) is that they run in a constant number of communication 
rounds, whereas the SIMAP system has the advantage of being able to process 
arithmetic operations very efficiently. 

Efficient extensions of Yao’s construction to more relevant adversarial models 
have been a topic of research interest in the last few years. There are several 
constructions which aim to secure the protocol against malicious adversaries 
without using generic zero-knowledge protocols. We will focus on the construction
of Lindell and Pinkas a which is efficient and provides fully simulatable
security according to the definition of Canetti m. A definition of a weaker class
of corruption, “covert adversaries”, and a protocol secure against this type of 
behavior, was provided by Aumann and Lindell 0|. In m an implementation 
of the basic Lindell-Pinkas protocol was reported upon and experimental data 
in various security models was provided. 

1 This construction may be preferable over other two-party protocols with secu- 
rity against malicious adversaries. The construction of Mohassel and Franklin 0 
only protects privacy and is not fully simulatable. The construction of Jarecki and 
Shmatikov |l3] requires the use of public-key operations, rather than symmetric key 
operations, for any gate of the circuit. The construction of Nielsen and Orlandi |2fl| . 
too, uses public key operations, or rather public-key based commitments, for each 
key of every wire of the circuit. A precise practical comparison between the different 
approaches is beyond the scope of the current paper. 
In this paper we improve on the implementation of in a number of ways.
The resulting set of quantitative improvements results in qualitative conclu- 
sions: (1) We demonstrate that two-party computation, secure against malicious 
adversaries, is truly practical, and we experimentally identify the performance 
bottlenecks which remain after our optimisations. This result should direct fur- 
ther research to the issues which have the largest effect on performance. (2) We 
experiment with a secure computation of the AES standard, and show that it 
is indeed feasible, even with security against malicious adversaries. There are a 
number of applications of such an implementation, some of which we describe be- 
low. (3) We provide the first implementation of a protocol with security against 
covert adversaries and we compare the performance of all 3 types of protocols: 
malicious, covert and semi-honest.

A more detailed summary of our main results is as follows: 

- We improve the communication cost for transmitting the circuits between the
  parties. In the case when we model the underlying key derivation functions
  (KDFs) as correlation robust (see discussion below), using the technique of
  [?] we are able to transmit no information for the XOR gates within the
  circuit. In this situation we are also able to reduce the data which needs to
  be sent by 25% for the other gates. When we are not willing to model the
  KDFs as correlation robust, and we only assume they are pseudo-random
  functions, we are unable to perform the free XOR optimisation. However
  we are able to reduce the communication cost for all gates by 50%. Unlike
  other methods used to improve communication, like [?], our improvement
  has only a marginal impact on computational costs. We will return to this in
  a later section.

- In addition to the theoretical analysis we provide experimental data for
  evaluating “real life” circuits in the honest-but-curious, covert and malicious
  adversary cases, and for the two different methods in the literature that
  construct the auxiliary circuits in the covert and malicious cases (see [?] and
  the full version). The implementation for the malicious setting is based on
  the construction of Lindell and Pinkas [?], which provides security in the
  sense of full simulatability. Therefore the resulting construction can be used
  as a black-box primitive in more complex applications. The use of our
  optimisations results in a considerable performance boost compared to previous
  experimental results published in [22].

  Our optimisations move the performance bottleneck to a different part
  of the computation; namely, the verification of garbled circuits generated
  by the circuit constructor. This observation is important for focusing future
  research on the issues that affect the overhead the most.

- We experiment with secure evaluation of a circuit which computes an AES
  encryption of a single block. The secure computation of AES involves one
  party which knows the key, and a different party which has an input block.
  The second party learns the encryption of the block, while the first party
  learns nothing. We demonstrate the feasibility of computing this function in
  the semi-honest, covert and malicious settings.
Secure evaluation of AES has an impact in a number of scenarios which we will
discuss in short here and elaborate on in the full version. The fact that a secure 
computation of AES is feasible, and can run in a matter of seconds, is quite 
surprising. 

Application 1, OPRF: A secure computation of a pseudo-random function,
denoted OPRF for “oblivious PRF”, has been defined in [?] for the purpose of
secure keyword based searches, and was subsequently used in different applications.
The OPRF protocol in [?] is based on the Naor-Reingold PRF, which is a
number theoretic construction. Our construction has different advantages over
the NR based construction, which we detail in the full version.

Application 2, Side Channel Protection: In [?] the authors introduce
“one-time programs”, which are programs that can only be executed once and 
then “self-destruct”. An important advantage of this construction is that the 
execution of the program reveals no side-channel information. Most of the com- 
putation in that construction is essentially done using a garbled Yao circuit. 

One of the main applications of smart cards is to compute symmetric encryp- 
tions, and therefore the ability to compute AES encryptions by Yao circuits has 
immediate application in the above scenario. It enables smart cards to perform 
a one-time computation, secure against side-channel attacks, of AES. This is 
particularly interesting since in that setting the circuit evaluation need only be 
secure against semi-honest adversaries, while we show below that semi-honest
computation of AES can be run very efficiently, taking only a few seconds. 

Application 3, Blind MACs and Blind Encryption: One can think of the 
operation of obtaining the AES encryption of a message, under the other party’s 
secret key, as a blind MAC or a blind symmetric encryption. These operations 
have different applications in secure computation. 

Application 4, Third Party Operations on Encrypted Data: We essen- 
tially show that encryption and decryption can be implemented using circuits. 
This enables secure computation of homomorphic operations on encrypted data. 
This operation is done by a circuit which receives two ciphertexts from one party 
and a key from the other party, decrypts the ciphertexts, applies some arbitrary 
mathematical operation to the plaintexts, and then encrypts the result. 

2 Yao’s Garbled Circuit Construction 

Two-party secure function evaluation makes use of the famous garbled circuit 
construction of Yao [?], which we briefly overview in this section. The basic
idea is to encode the function to be computed via a binary circuit and then to 
securely evaluate the circuit on the players’ inputs. 
2.1 Garbled Circuits

We consider two parties, denoted as $P_1$ and $P_2$, who wish to compute a function
securely which is represented as a simple binary circuit. First assume the circuit
consists of only a single gate with two input wires and one output wire. We
denote the input wires by $w_1$ and $w_2$, and the output wire by $w_3$. The input to
$w_1$ is denoted by $b_1$ and is known to $P_1$; similarly $P_2$ knows the input to $w_2$ and
this is given by $b_2$. Each gate has a unique identifier Gid; this enables a circuit
fan out of greater than one, i.e., it enables the output wire of one gate to be used
in more than one other gate. We require that $P_2$ evaluates the gate on the two
inputs, without $P_1$ learning anything, and without $P_2$ determining the value $b_1$,
bar what it can deduce from the output of the gate and its own input. We define
the output of the gate by the function $G(b_1, b_2) \in \{0, 1\}$.

The construction of Yao works as follows. $P_1$ encodes, or garbles, each wire
$w_i$ by selecting two different cryptographic keys $k_i^0$ and $k_i^1$ of length $t$. Here $t$ is
a computational security parameter which suffices as the key length of a symmetric
encryption scheme. A random permutation $\pi_i$ of $\{0, 1\}$ is associated to each wire.
The garbled value of wire $w_i$ is then represented by $k_i^{b_i} \| c_i$, where $c_i = \pi_i(b_i)$.
We call the value $c_i$ the “external value” of the wire; note that this value is
completely independent of the actual value of the wire $b_i$.

An encryption function $E^s_{k_1,k_2}(m)$ is selected which has as input two keys
of length $t$, a message $m$, and some additional information $s$. The additional
information $s$ must be unique per invocation of the encryption function, i.e., it
is used only once for any choice of keys. The gate itself is then replaced by a
four entry table indexed by the values of $c_1$ and $c_2$, and given by

$$c_1, c_2 : E^{\mathrm{Gid}\|c_1\|c_2}_{k_1^{b_1},\,k_2^{b_2}}\left(k_3^{G(b_1,b_2)} \,\|\, c_3\right),$$

where $c_1 = \pi_1(b_1)$, $c_2 = \pi_2(b_2)$, and $c_3 = \pi_3(G(b_1, b_2))$. Each entry in the table
corresponds to a combination of the values of the input wires and contains the
encryption of the corresponding garbled output value. The resulting look up
table, or set of look up tables in general, is called the “garbled circuit”.

Player $P_1$ then sends to $P_2$ the garbled circuit, the key corresponding to its
input value $k_1^{b_1}$, the value $c_1 = \pi_1(b_1)$, and the permutation $\pi_3$. The parties
engage in an oblivious transfer (OT) protocol so that $P_2$ learns the value of
$k_2^{b_2} \| c_2$, where $c_2 = \pi_2(b_2)$. Player $P_2$ can then decrypt the entry in the look up
table indexed by $(c_1, c_2)$ using $k_1^{b_1}$ and $k_2^{b_2}$, revealing the value of $k_3^{G(b_1,b_2)} \| c_3$. $P_2$
determines the value of $G(b_1, b_2)$ by using the mapping $\pi_3^{-1}$ from $c_3$ to $\{0, 1\}$.
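
The single-gate construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the names `new_wire`, `garble_gate` and `eval_gate` are hypothetical, the key length of 16 bytes is an assumed value of $t$, the pad is assumed to be derived with SHA-256, the external value is stored in a whole byte instead of a single bit, and the oblivious transfer step is not modelled.

```python
import hashlib
import secrets

T = 16  # assumed key length in bytes (t = 128 bits)

def kdf(k1: bytes, k2: bytes, s: bytes, nbytes: int) -> bytes:
    """Assumed pad derivation: H(k1||s) XOR H(k2||s), truncated, with H = SHA-256."""
    h1 = hashlib.sha256(k1 + s).digest()[:nbytes]
    h2 = hashlib.sha256(k2 + s).digest()[:nbytes]
    return bytes(a ^ b for a, b in zip(h1, h2))

def new_wire():
    """Two random keys plus a permutation bit p, so that pi(b) = b XOR p."""
    return {"keys": (secrets.token_bytes(T), secrets.token_bytes(T)),
            "perm": secrets.randbelow(2)}

def garble_gate(gate, w1, w2, w3, gid: bytes):
    """Build the four-entry table, indexed by the external values (c1, c2)."""
    table = {}
    for b1 in (0, 1):
        for b2 in (0, 1):
            c1, c2 = b1 ^ w1["perm"], b2 ^ w2["perm"]
            out = gate(b1, b2)
            c3 = out ^ w3["perm"]
            plain = w3["keys"][out] + bytes([c3])      # k3^{G(b1,b2)} || c3
            pad = kdf(w1["keys"][b1], w2["keys"][b2],
                      gid + bytes([c1, c2]), T + 1)    # s = Gid || c1 || c2
            table[(c1, c2)] = bytes(x ^ y for x, y in zip(plain, pad))
    return table

def eval_gate(table, k1, c1, k2, c2, gid: bytes):
    """Decrypt the single entry selected by (c1, c2); returns (k3, c3)."""
    pad = kdf(k1, k2, gid + bytes([c1, c2]), T + 1)
    plain = bytes(x ^ y for x, y in zip(table[(c1, c2)], pad))
    return plain[:T], plain[T]
```

The evaluator only ever touches the row selected by the two external values, so it learns one output key and one external value, which it can map back to a real bit only if it holds the permutation for that wire.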

In the general case the circuit consists of multiple gates. Player $P_1$ chooses
random garbled values for all wires and uses them for constructing tables for
all gates. It sends these tables, i.e., the garbled circuit, to $P_2$ and in addition
provides $P_2$ with the garbled values and the $c$ values of $P_1$’s inputs, and with the
permutations $\pi$ used to encode the output wires of the circuit. Player $P_2$ uses
invocations of oblivious transfer to learn the garbled values and $c$ values of its
own inputs to the circuit. Given these values, $P_2$ can evaluate the gates in the
first level of the circuit, and compute the garbled values and the $c$ values of their
output wires. Player $P_2$ can then continue with this process and compute the
garbled values of all wires in the circuit. Finally $P_2$ uses the $\pi$ permutations of
the output wires of the circuit to compute the real output values of the circuit.
If $P_1$ additionally requires some output from the circuit then this can be dealt
with by standard mechanisms, as described in the full version.

One could use more general gates than 2-to-1 gates, such as $n$-to-$m$ gates
with $2^n$ entries. However the optimisations we shall present in this paper are
most effective when applied to 2-to-1 gates. While we found that more general
gates can improve the performance of a naive Yao circuit protocol, they actually
decrease the performance of the optimisations. Hence the rest of this paper is
restricted to 2-to-1 gates.

2.2 Required Implementation Details 

Having given the basic theoretical description of Yao’s protocol and its extensions,
we now present a number of implementation details which are needed
to understand some of our optimisations. The basic implementation choice of
the underlying encryption scheme to be used is the same as the implementation
described in [?].

Oblivious transfer: Unlike [?] we do not use the OT scheme of Hazay and
Lindell (HL) [?]. Instead we use the OT scheme of Peikert et al. (PVW) [?].
This scheme is UC-secure and hence requires the setup of a Common Reference
String (CRS) of a few hundred bits. For our experiments we assume that this is
given to the parties. (Alternatively, the parties can run a coin-tossing protocol
to generate the CRS, which is possible due to the nature of the CRS used in the
PVW scheme.) The batched method of PVW is more efficient per OT than the
batched method of HL, especially on the receiver’s side. In particular the CRS
can be used for any number of invocations of the OT, whereas the method in HL
requires the maximum number of OTs being executed to be known before the
setup is performed. (The setup in HL also requires two ZK-proofs as opposed to a
CRS being created in PVW.) The OT stage is not our computational bottleneck,
and is unlikely to be, unless one is in the rare situation of having a circuit with
a large number of inputs for $P_2$ and yet a relatively small number of gates.
Thus we do not consider optimisations of OT schemes which are secure against
only semi-honest or covert adversaries, since the fully secure OT is efficient
enough.

Encryption scheme: The only implementation detail we will need from [?]
is that the encryption scheme is implemented via

$$E^s_{k_1,k_2}(m) = m \oplus \mathrm{KDF}^{|m|}(k_1, k_2, s)$$

where KDF is a key derivation function, whose $|m|$ bits of output are independent
of the two input keys in isolation, and which depends on the value of $s$. We
will instantiate this function as follows (footnote 2):

$$\mathrm{KDF}^{\ell}(k_1, k_2, s) = H(k_1 \| s)_{1 \ldots \ell} \oplus H(k_2 \| s)_{1 \ldots \ell}.$$

Even if $H$ is a Merkle-Damgård type hash function this will be secure (with the
associated issues of length extension), since we are only applying the function to
fixed length inputs. Indeed, in our experiments we implement $H$ using SHA-256.


Modeling the hash function, and correlation robustness: In this paper
we need to model the underlying hash function $H$ in two ways. In the first we
make the usual assumption that it behaves as a pseudo-random function, namely
that $H(k \| s)$ is an invocation of a pseudo-random function keyed by $k$, with the
input $s$. However one of our optimisations requires that we make a stronger
assumption on the hash function, namely that it is correlation robust. This latter
property can be stated formally as follows:

Definition 1 (Correlation robustness [?]). An efficiently computable function
$H : \{0, 1\}^* \to \{0, 1\}^{\ell}$ is correlation robust if the following distribution is
pseudo-random: $(t_1, \ldots, t_m, H(t_1 \oplus r), \ldots, H(t_m \oplus r))$, where $t_1, \ldots, t_m$ and $r$
are chosen at random, and $m$ is polynomial in the security parameter.

This can also be stated by saying that the function $f_r(x) = H(x \oplus r)$ is a
weak pseudo-random function. The definition also implies that the distribution
of $(H(t_1), \ldots, H(t_m), H(t_1 \oplus r), \ldots, H(t_m \oplus r))$ is pseudo-random.

The correlation-robustness assumption is satisfied by a random oracle (or
rather by a very weak form of it: a non-programmable, non-extractable random
oracle). However, assuming correlation robustness seems to be a much weaker
requirement than assuming the existence of random oracles. This assumption
was introduced in [?] and was used there for providing security against
malicious adversaries for a method of extending oblivious transfer. The correlation
robustness assumption has been recently used in the context of oblivious
transfer [?] and in the context of secure computation [?].

For our construction, as we deal with circuits with arbitrary fan out, we require
a slightly modified definition. Namely, that for any set $S = \{s_1, \ldots, s_{|S|}\}$
of size which is of the same order as the number of gates, the distribution of

$$\big(t_1, \ldots, t_m,\ (H((t_1 \oplus r) \| s_1), \ldots, H((t_m \oplus r) \| s_1)),\ (H((t_1 \oplus r) \| s_2), \ldots, H((t_m \oplus r) \| s_2)),\ \ldots,\ (H((t_1 \oplus r) \| s_{|S|}), \ldots, H((t_m \oplus r) \| s_{|S|}))\big)$$

is pseudo-random, where $t_1, \ldots, t_m$ and $r$ are chosen at random. In other words,
all the pads that are used for encrypting table entries are pseudo-random. If one
is willing to assume this then our optimisations provide highly efficient protocols.
We also provide optimisations for when the user is unwilling to make such an
assumption.

2 In [?] two instantiations were presented, depending on whether we are working in
the random oracle model (ROM) or standard model, via truncating, or extending,
the output of a suitable hash function $H$ in the standard way as follows:

$$\mathrm{KDF}^{\ell}(k_1, k_2, s) = \begin{cases} H(k_1 \| k_2 \| s)_{1 \ldots \ell} & \text{if } H \text{ is modeled as an RO,} \\ H(k_1 \| s)_{1 \ldots \ell} \oplus H(k_2 \| s)_{1 \ldots \ell} & \text{if } H \text{ is modeled as a PRF.} \end{cases}$$

The difference is that the security analysis in the ROM works even if we feed related
keys to different invocations of the function. Namely, it is possible to compute, say,
$H(k_1 \| k_2)$, $H(k_1 \| k_2')$, $H(k_1' \| k_2)$ and $H(k_1' \| k_2')$ and claim that knowledge of $k_1, k_2$
does not disclose information about any of the values except $H(k_1 \| k_2)$. This is impossible
in the standard model. Therefore if $H$ is modeled as a PRF it must be invoked
separately with each key.

3 Structural Optimisations of the Circuit 

Yao’s protocol operates on functions which are described as a boolean circuit, and 
its overhead depends on the size of the circuit. A convenient way of generating 
a representation of a function in this form is to use a compiler which translates 
a description of a function in a high-level language to a description as a binary 
circuit. The Fairplay system provides a compiler for this task which operates on 
functions described in a high-level language called Secure Function Description 
Language (SFDL) [?]. We use that compiler as the basis of our experiments,
but use our own run-time environment to execute the protocol. 

There are a number of general circuit simplifications which can be applied
to the output of the Fairplay compiler. We have implemented a number of these,
based on two basic ideas: (1) identifying component circuits which can be re- 
placed by simpler combinations of gates, and (2) identifying complicated compo- 
nents whose output must always be zero, or one; this allows for the component to 
be removed and other subsequent components to be further simplified. A com- 
bination of these techniques is surprisingly effective, and allows us to produce 
circuits which are often 60 percent more efficient than the circuit produced by 
the Fairplay compiler. 

Many of the techniques used are ad-hoc, but the following technique is particularly
effective. First, by a technique akin to common sub-expression elimination,
we identify sets of gates which can be replaced by a single 3-to-1 gate, and then
replace the 3-to-1 gate with a set of 2-to-1 gates chosen to minimize
the number of non-XOR gates. This is particularly effective when combined with
our later technique of Section 4, in the case of correlation robust KDFs, to remove
the cost of any XOR gates; however the technique is also successful in
the more general case. We call a gate even if its truth table has an even
number of ‘1’ entries (for example, a XOR gate is even), otherwise it is called
odd (an OR gate, for example, is odd). We show in the full version that it is
possible to replace any 3-to-1 even gate with at most a single 2-to-1 non-XOR
gate and at most three XOR gates. The optimal transformation rules, which we
found by exhaustive search, are listed in the full version.
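
The even/odd classification is easy to check mechanically. The following Python sketch is illustrative only; the helper names `truth_table` and `is_even` are hypothetical and do not come from the paper.

```python
from itertools import product

def truth_table(gate, n=2):
    """Outputs of an n-input gate over all 2**n input combinations."""
    return tuple(gate(*bits) for bits in product((0, 1), repeat=n))

def is_even(gate, n=2):
    """A gate is even when its truth table contains an even number of ones."""
    return sum(truth_table(gate, n)) % 2 == 0

XOR = lambda a, b: a ^ b
OR = lambda a, b: a | b
AND = lambda a, b: a & b

# All 16 two-input gates, identified by their truth tables; keep the even ones.
even_tables = [tt for tt in product((0, 1), repeat=4) if sum(tt) % 2 == 0]
```

Exactly half of the sixteen 2-to-1 truth tables are even. Six of those eight are constants or depend on a single input, which leaves XOR and its negation as the only even gates that need a garbled table.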

4 Optimisations with Free XORs, When the KDF Is 
Correlation Robust 

In [?] Kolesnikov and Schneider present an optimisation based on the correlation
robustness assumption, which allows XOR gates to be evaluated for free, thus
doing away with the need to evaluate or transmit the garbled tables for such
gates. The optimisation requires that there is a global random value $R$ of bit
length $t$, known only to $P_1$, such that for all garbled wires $w_i$ it holds that
$k_i^1 = k_i^0 \oplus R$. In other words, the garbling of the 1 value of a wire is determined
purely from XOR-ing the garbled 0 value with the value $R$. Note that a similar
property holds for the external values of the wire: $\pi_i(1) = \pi_i(0) \oplus 1$. With this
convention we have that a XOR gate can be implemented by simply XOR-ing
together the two garbled input values, and the two external values. Namely, for
a XOR gate mapping wires $w_1$ and $w_2$ to wire $w_3$, it holds that $k_3 = k_1 \oplus k_2$ and
$c_3 = c_1 \oplus c_2$. For a full proof of this optimisation see [?]. Note that [?] states
the proof in the random oracle model, but it can be easily seen, as noted in [?],
that the proof can be based on the correlation robustness assumption.
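
The free XOR convention can be illustrated with a few lines of Python. This is a sketch under assumptions, not the authors' code: the wire representation (a zero-key `k0` and a permutation bit `perm`, with $\pi(b) = b \oplus$ `perm`) and all function names are hypothetical, and keys are taken to be 16 bytes long.

```python
import secrets

T = 16  # assumed key length in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

R = secrets.token_bytes(T)  # global offset, known only to the circuit constructor

def new_wire():
    """Only the key for 0 is stored; the key for 1 is k0 XOR R."""
    return {"k0": secrets.token_bytes(T), "perm": secrets.randbelow(2)}

def garbled(wire, b):
    """Garbled value (key, external value) of bit b on the given wire."""
    key = wire["k0"] if b == 0 else xor(wire["k0"], R)
    return key, b ^ wire["perm"]

def xor_output_wire(w1, w2):
    """Constructor side: the output wire of a XOR gate needs no fresh randomness."""
    return {"k0": xor(w1["k0"], w2["k0"]), "perm": w1["perm"] ^ w2["perm"]}

def eval_xor(k1, c1, k2, c2):
    """Evaluator side: no table lookup and no hashing."""
    return xor(k1, k2), c1 ^ c2
```

The offset $R$ cancels when both inputs are 1 and survives when exactly one input is 1, so the evaluator's result always equals the garbled value of $b_1 \oplus b_2$ on the output wire.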

Garbled Row Reduction (GRR): The above solution is ideal for XOR gates,
but in addition we would like to reduce the size of the tables of the non-XOR
gates as well. The following simple optimisation (which was pointed out in [?])
provides a 25 percent reduction in the sizes of the tables needed to represent
two-input gates. We can do this in a way which still allows the use of the above
trick for free XOR gates. (In general, this method provides a $1/2^n$ reduction
in the size of $n$-to-1 gates, but we will only describe it in detail for the two
input case.)

The observation is that instead of defining the two garbled values of the output
wires randomly, we can define one of them as a function of garbled values of the
two input wires which result in this output value. In other words, we choose an
input pair $(b_1, b_2) \in \{0, 1\}^2$, and define the garbled output value of $G(b_1, b_2)$ to
be a function of the garbled values of $b_1$ and $b_2$. The gate table therefore need
not store an entry for the input combination $(b_1, b_2)$. In the evaluation phase,
if the evaluator has the garbled values of the pair $(b_1, b_2)$ it can compute the
corresponding garbled output directly, without consulting the gate table.

Suppose the gate maps wire $w_1$ and wire $w_2$ to wire $w_3$. As before we let
$k_i^0$ and $k_i^1$ denote the garbled wire values, $G(b_1, b_2)$ denote the function being
implemented by the gate, and we set the external value of the wire to be $c_i = \pi_i(b_i)$.
We then define the garbled output value corresponding to the output
resulting from the external input values $(c_1, c_2) = (0, 0)$ as

$$k_3^{G(\pi_1^{-1}(0),\,\pi_2^{-1}(0))} \,\|\, c_3 = \mathrm{KDF}^{t+1}\left(k_1^{\pi_1^{-1}(0)},\ k_2^{\pi_2^{-1}(0)},\ \mathrm{Gid}\|0\|0\right).$$

In other words, the garbled value is exactly equal to the pseudo-random mask
that was used to hide it in the basic protocol. Note that this operation also
defines the external value $c_3$ of this output value. We therefore define $\pi_3$ such
that $c_3 = \pi_3(G(\pi_1^{-1}(0), \pi_2^{-1}(0)))$. The other garbled value of the output wire,
$k_3^{1 - G(\pi_1^{-1}(0),\,\pi_2^{-1}(0))}$, is then chosen as in the free XOR method above, to enable the
evaluation of XOR gates for free. The table is then constructed in the standard
way except that we do not store, or transmit, its first entry.

On evaluating the garbled gate the evaluator proceeds as in the standard
algorithm except when it wishes to access the first entry of the table, i.e., when
the external values of both input wires are 0, namely $c_1 = c_2 = 0$. In that
case it possesses the garbled values $k_1^{b_1}$ and $k_2^{b_2}$, where $b_1 = \pi_1^{-1}(0)$ and
$b_2 = \pi_2^{-1}(0)$. It uses them to compute $k_3^{G(b_1,b_2)}$ and $c_3 = \pi_3(G(b_1, b_2))$, by computing
$\mathrm{KDF}^{t+1}(k_1^{b_1}, k_2^{b_2}, \mathrm{Gid}\|0\|0)$ as defined in the equation above.

We will denote this optimisation as Garbled Row Reduction, GRR for short,
in our future discussions.
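
A possible rendering of GRR combined with the free XOR offset is sketched below in Python. It is an illustration under assumptions, not the authors' implementation: the function names and the wire representation (`k0`, `perm`) are hypothetical, keys are 16 bytes, the pad is assumed to be derived with SHA-256, and the external value occupies a full byte in each stored row.

```python
import hashlib
import secrets

T = 16  # assumed key length in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def kdf(k1, k2, s, nbytes):
    # Assumed pad derivation: H(k1||s) XOR H(k2||s), truncated, with H = SHA-256.
    return xor(hashlib.sha256(k1 + s).digest()[:nbytes],
               hashlib.sha256(k2 + s).digest()[:nbytes])

def key(wire, b, R):
    # Free XOR convention: the key for 1 is the key for 0 XOR-ed with R.
    return wire["k0"] if b == 0 else xor(wire["k0"], R)

def garble_gate_grr(gate, w1, w2, gid, R):
    """Return the output wire and a three-row table (row (0, 0) is omitted)."""
    # Real inputs whose external values are (0, 0): b_i = pi_i^{-1}(0) = perm_i.
    b1, b2 = w1["perm"], w2["perm"]
    v = gate(b1, b2)
    pad = kdf(key(w1, b1, R), key(w2, b2, R), gid + bytes([0, 0]), T + 1)
    k_v, c3 = pad[:T], pad[T] & 1          # the mask itself becomes k3^v || c3
    w3 = {"k0": k_v if v == 0 else xor(k_v, R), "perm": c3 ^ v}
    table = {}
    for c1, c2 in ((0, 1), (1, 0), (1, 1)):
        b1, b2 = c1 ^ w1["perm"], c2 ^ w2["perm"]
        out = gate(b1, b2)
        plain = key(w3, out, R) + bytes([out ^ w3["perm"]])
        pad = kdf(key(w1, b1, R), key(w2, b2, R), gid + bytes([c1, c2]), T + 1)
        table[(c1, c2)] = xor(plain, pad)
    return w3, table

def eval_gate_grr(table, k1, c1, k2, c2, gid):
    pad = kdf(k1, k2, gid + bytes([c1, c2]), T + 1)
    if (c1, c2) == (0, 0):                 # missing row: the pad is the answer
        return pad[:T], pad[T] & 1
    plain = xor(table[(c1, c2)], pad)
    return plain[:T], plain[T]
```

Only three of the four rows are stored, which is where the 25 percent saving comes from, and the output wire still satisfies the free XOR relation because its second key is derived from the first using $R$.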

Security: We sketch why the above optimisation maintains security. Recall
that the proof of security for Yao’s protocol given in [?] shows security against
a corrupt $P_2$ based on a hybrid argument, and on a claim that for each gate it is
infeasible to distinguish between a correct garbled table of this gate and a table
which encrypts the same value in all four entries. In order for this argument to
apply to the GRR optimisation, it is required to show that it is infeasible to find
out if the garbled value assigned to the first table entry, $k_3^{G(\pi_1^{-1}(0),\,\pi_2^{-1}(0))}$, is
equal to the values encrypted in the other entries. However this value is equal to
the mask that is used to encrypt the first entry in Yao’s original protocol, and
we know that if a polynomial adversary is given only a single pair of garbled
input values then the masks that are used for encrypting the other entries of the
table are pseudo-random. Therefore the claim follows.

5 Optimisations without Free XORs, When the KDF Is
Not Correlation Robust

One may not want to assume the KDF is correlation robust, or perhaps the
proportion of XOR gates in the circuit is so low that making this assumption is
not as effective. In these situations, too, we would like to reduce the overhead
required by the Yao circuit. This section describes an optimisation which reduces
the size of every two-input gate by 50%, but which, unfortunately, cannot be
combined with the free XOR method of Section 4.

The underlying idea is that if we are not using the free XOR trick then the two
values of the output wire can be chosen independently (footnote 3). The 50% reduction in
the size of the gate tables is based on Shamir secret sharing [?]. It makes use of
a finite field $\mathbb{F}_{2^t}$. Recall that $t$ is the bit length of the keys used to represent the
garbled values of the wires. We can therefore interpret keys as elements of $\mathbb{F}_{2^t}$
and vice versa. We also interpret small integers such as 1, 2, 3 etc. as elements
in $\mathbb{F}_{2^t}$. For example if we think of $\mathbb{F}_{2^t}$ as $\mathbb{F}_2[X]/(f(X))$, for some irreducible
polynomial $f$ of degree $t$, then the integer 3 can be interpreted as $X + 1$.

As before we assume a garbled table indexed by the external values, $c_1$ and $c_2$,
and each entry corresponds to the value being output, on input of the values $k_1^{b_1}$
and $k_2^{b_2}$ where $b_i = \pi_i^{-1}(c_i)$. We set the rows of the gate table to be numbered
$1, \ldots, 4$, and therefore set $r = 2c_1 + c_2 + 1$ to be the row number of table entry
$(c_1, c_2)$. We define the value used to mask this entry as

$$K_r \,\|\, M_r = \mathrm{KDF}^{t+1}\left(k_1^{b_1}, k_2^{b_2}, s\right) \qquad (1)$$

where $s = \mathrm{Gid}\|c_1\|c_2$, $K_r$ is a bit string of length $t$ bits and $M_r$ is a single bit
used to mask the external value of the output. We use a different method for
optimising odd and even gates. The truth table of each gate, and therefore also
the information whether the gate is odd or even, is known to the circuit evaluator.
Therefore it can compute each gate according to the right method. (The only
information hidden from the evaluator is the values passing on intermediate wires
of the circuit.)

3 This allows for possible extensions of the GRR method, and in the full version we
detail another optimisation method, which we call Garbled Table Reduction (GTR),
which reduces the size of the garbled tables needed to represent odd 2-to-1 gates
by 1/3, and the size of tables of even 2-to-1 gates by 1/2.

5.1 Odd 2-to-1 Gates

Suppose we are implementing an OR-gate, where the external values $c_1 = 0$
and $c_2 = 0$ correspond to the real input values (0,0); the other cases will follow
immediately from the following. This means that the rows $r = 2, 3$ and 4
should evaluate to the same output value $k_3^1$, whilst $r = 1$ should evaluate to the
output value $k_3^0$. We first define over $\mathbb{F}_{2^t}$ a polynomial $P(X)$ of degree two, by
interpolating the polynomial which passes through the three points $(2, K_2)$, $(3, K_3)$
and $(4, K_4)$, where each $K_r$ value was defined according to equation (1). (This
is the value which in the other constructions was used to mask entry $r$ of the
table.) The garbled output value $k_3^1$ is defined to be $k_3^1 = P(0)$. We also compute
$K_5 = P(5)$ and $K_6 = P(6)$. We then define a second polynomial $Q(X)$, also of
degree two, by interpolating the polynomial which passes through the three points
$(1, K_1)$, $(5, K_5)$ and $(6, K_6)$, where $K_1$ was defined according to equation (1).
The garbled output value $k_3^0$ is now defined by $k_3^0 = Q(0)$. The garbled table is
replaced by the two values $(K_5, K_6)$. In addition, for each of the four original
rows, the external value for the output wire in the $r$th row is encrypted using
the bit $M_r$, defined in equation (1). The total amount of data sent for the gate
is therefore $2t + 4$ bits.

Player $P_2$ then, given two key values $k_1^{b_1}$ and $k_2^{b_2}$ plus two external values $c_1$
and $c_2$, computes, using equation (1), the value of $K_r$ and $M_r$ for $r = 2c_1 + c_2 + 1$.
Recall that the evaluator knows $r$ but not $b_1$ or $b_2$. It then uses the two supplied
values of $K_5$ and $K_6$ to interpolate the polynomial passing through the points
$(r, K_r)$, $(5, K_5)$ and $(6, K_6)$. The result is either $Q(X)$ or $P(X)$, depending on
whether $r = 1$ or not. Player $P_2$ then recovers the associated secret value $k_3^{G(b_1,b_2)}$, by
evaluating the polynomial at the point $X = 0$. Using $M_r$ the evaluator can also
decrypt the encryption of the external value of the output wire and so obtains
$c_3$. Hence the evaluator recovers the correct value of the output wire.
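
The interpolation steps for an odd gate can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the paper: $t = 128$, the field is represented modulo $x^{128} + x^7 + x^2 + x + 1$, field elements are Python integers, and the function names are hypothetical. The masks $K_1, \ldots, K_4$ are passed in directly instead of being derived with the KDF, and the four encrypted external-value bits are omitted.

```python
import secrets

T_BITS = 128
MOD = (1 << 128) | 0x87  # assumed modulus x^128 + x^7 + x^2 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in GF(2^128), reduced modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> T_BITS:
            a ^= MOD
    return r

def gf_inv(a: int) -> int:
    """Inverse of a non-zero element, computed as a^(2^128 - 2)."""
    result, e = 1, (1 << T_BITS) - 2
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

def interpolate_at(points, x: int) -> int:
    """Lagrange-evaluate at x the polynomial through the given (xi, yi) points.
    Subtraction in characteristic 2 is XOR."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = gf_mul(num, x ^ xj)
                den = gf_mul(den, xi ^ xj)
        total ^= gf_mul(yi, gf_mul(num, gf_inv(den)))
    return total

def garble_odd_gate(K):
    """K maps row r in {1, 2, 3, 4} to the mask K_r; row 1 is the odd row out.
    Returns (key shared by rows 2-4, key of row 1, transmitted pair (K5, K6))."""
    p_points = [(2, K[2]), (3, K[3]), (4, K[4])]
    k_shared = interpolate_at(p_points, 0)                 # P(0)
    K5, K6 = interpolate_at(p_points, 5), interpolate_at(p_points, 6)
    k_single = interpolate_at([(1, K[1]), (5, K5), (6, K6)], 0)  # Q(0)
    return k_shared, k_single, (K5, K6)

def eval_odd_gate(r: int, K_r: int, K5: int, K6: int) -> int:
    """Evaluator: interpolate through (r, K_r), (5, K5), (6, K6), read off X = 0."""
    return interpolate_at([(r, K_r), (5, K5), (6, K6)], 0)
```

For the OR gate of the text, `k_shared` plays the role of $k_3^1$ and `k_single` that of $k_3^0$. The evaluator runs the same three-point interpolation for every row and never needs to know which of the two polynomials it has reconstructed.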

5.2 Even 2-to-1 Gates

The only non-trivial even 2-to-1 gates are the XOR and NXOR gates, since all
other gates can be replaced by wires. Again let us assume the external input
values $c_1 = 0$ and $c_2 = 0$ correspond to the real input values (0,0), and assume
we are dealing with a XOR gate. Then the entries 1 and 4 in the standard garbled
table will correspond to the same output key, namely $k_3^0$. Any other case
will follow from the following description.

Player $P_1$ first creates a linear polynomial $P(X)$ over $\mathbb{F}_{2^t}$ which interpolates
the two points $(1, K_1)$ and $(4, K_4)$. The value of $k_3^0$ is defined to be equal
to $P(0)$. If the external value of this output value is 0 then we store $P(5)$ into
the first row of the new table of this gate, otherwise we store $P(5)$ as the second
entry. Then $P_1$ creates another linear polynomial $Q(X)$ which interpolates the
two points $(2, K_2)$ and $(3, K_3)$. The value of $k_3^1$ is then defined to be $Q(0)$,
and the value $Q(5)$ is stored in the remaining row of our new table. The external
values of the output wires are now encrypted and stored, using the $M_r$ values
as before, as a separate sub-table of 4 bits in length. Thus, the total amount of
data required to represent the gate is $2t + 4$ bits.

Player $P_2$, given two key values $k_1^{b_1}$ and $k_2^{b_2}$ plus two external values $c_1$ and
$c_2$, computes the value of $K_r$ and $M_r$. Using $M_r$ it can determine the external
value of the output wire. If this external value is zero then using the first entry
of our garbled table and the value of $K_r$, the evaluator recovers $P(X)$ and hence
$P(0) = k_3^0$. If the external value is one then using the second entry of the
table and the value $K_r$, the evaluator recovers $Q(X)$ and hence $Q(0) = k_3^1$.
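
The even-gate variant needs only lines rather than quadratics. The Python sketch below makes the same assumptions as before and none of them come from the paper: a 128-bit field modulo $x^{128} + x^7 + x^2 + x + 1$, integers as field elements, and hypothetical function names. The masks $K_r$ are supplied directly, the output wire's permutation is modelled as a bit `perm3` with $\pi_3(b) = b \oplus$ `perm3`, and the 4-bit sub-table of encrypted external values is not modelled (the evaluator is simply handed $c_3$).

```python
import secrets

T_BITS = 128
MOD = (1 << 128) | 0x87  # assumed modulus x^128 + x^7 + x^2 + x + 1

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication in GF(2^128), reduced modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> T_BITS:
            a ^= MOD
    return r

def gf_inv(a: int) -> int:
    """Inverse of a non-zero element, computed as a^(2^128 - 2)."""
    result, e = 1, (1 << T_BITS) - 2
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

def line_at(p, q, x: int) -> int:
    """Value at x of the line through points p and q (subtraction is XOR)."""
    (x1, y1), (x2, y2) = p, q
    slope = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
    return y1 ^ gf_mul(slope, x ^ x1)

def garble_xor_gate(K, perm3: int):
    """Rows 1 and 4 share output 0, rows 2 and 3 share output 1 (a XOR gate
    whose external inputs (0, 0) correspond to the real inputs (0, 0))."""
    k0 = line_at((1, K[1]), (4, K[4]), 0)      # P(0)
    k1 = line_at((2, K[2]), (3, K[3]), 0)      # Q(0)
    table = [None, None]
    table[0 ^ perm3] = line_at((1, K[1]), (4, K[4]), 5)   # P(5), at index pi3(0)
    table[1 ^ perm3] = line_at((2, K[2]), (3, K[3]), 5)   # Q(5), remaining row
    return k0, k1, table

def eval_xor_gate(r: int, K_r: int, c3: int, table) -> int:
    """The external output value c3 selects the stored point to pair with (r, K_r)."""
    return line_at((r, K_r), (5, table[c3]), 0)
```

Each stored entry is one field element of $t$ bits, so the table holds $2t$ bits instead of the $4t$ bits of the unoptimised construction.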

Security: We sketch why the above optimisations maintain security. Given a
pair of garbled values of the input wires, $P_2$ can compute a garbled output
value, but cannot distinguish the other garbled output value from random. This
is because that other garbled value is defined using a linear combination with a
value which is unknown to $P_2$. This fact can be used in a somewhat modified
security proof in the spirit of the proof of Yao’s protocol in [?].

6 Some Experimental Results 

We now present some experimental results. In our results we separate out pre- 
computation time, i.e., generating the required garbled circuits, from the rest 
of the computation. This is because it depends on the application whether one 
should consider this time as part of the computation time or not. 

There are two major conclusions of our experiments. Firstly, assuming the 
KDF is correlation robust then the GRR optimisation produces the most ef- 
ficient implementation. Secondly we conclude that rather large circuits can be 
practically evaluated using the methods described. Thus secure two-party com- 
putation has become more of a reality than one might previously have thought. 

Example 1 - Evaluating a Simple Circuit: First we present results for a
simple circuit in which each of P1’s and P2’s inputs is a 32-bit integer. The
output for P2 should be the single bit resulting from the application of the
comparison operator to the inputs. The output for P1 will be a six-bit integer
resulting from the scalar product of the bits of the two inputs, i.e. the
number of ones in the string obtained from forming the bit-wise “and” of the
two strings.

Table 1. Experimental Results For Example 1 (Times are in seconds)

Adv.         Input Enc.     Method    No.     % XOR   Precomp  Send   OT     Calc   Total  Total
                                      Gates   Gates   Time     Time   Time   Time   Time   KBytes
Semi-Honest  -              Base      251     11      0        0      2      0      2      46
                            PRF-SS    537     55      0        0      1      0      1      34
                            CoR-GRR   537     55      0        0      1      0      1      22
                            ROM-GRR   537     55      0        0      1      0      1      22
Covert       Indep. Inputs  Base      419     38      7        1      4      6      18     1188
                            PRF-SS    705     61      8        0      2      7      17     969
                            CoR-GRR   705     61      6        1      3      5      15     682
                            ROM-GRR   705     61      1        1      2      0      4      629
Covert       Random Comb.   Base      1247    79      9        2      4      7      22     2275
                            PRF-SS    1535    82      9        1      3      7      20     1646
                            CoR-GRR   1555    82      7        1      3      5      16     682
                            ROM-GRR   1555    82      1        1      3      0      5      629
Malic.       Indep. Inputs  Base      1571    83      171      80     47     54     352    180599
                            PRF-SS    1857    85      175      79     39     67     360    173942
                            CoR-GRR   1857    85      147      78     37     39     301    164323
                            ROM-GRR   1857    85      141      71     37     38     287    161741
Malic.       Random Comb.   Base      3029    89      163      75     19     64     321    167276
                            PRF-SS    2799    90      161      74     16     69     320    158904
                            CoR-GRR   2781    90      117      75     16     39     247    140265
                            ROM-GRR   2802    90      117      69     16     37     239    137609
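
As a reference point, the functionality of Example 1 can be evaluated in the clear. The following is a plain (insecure) Python sketch, not the garbled-circuit protocol. The function name is hypothetical, and the direction of the comparison (x < y) is an assumption, since the text only refers to "the comparison operator".

```python
def example1_functionality(x: int, y: int, bits: int = 32):
    """Evaluate Example 1 in the clear; x is P1's input, y is P2's input."""
    mask = (1 << bits) - 1
    x &= mask
    y &= mask
    # Output for P2: a single comparison bit (assumed here to be x < y).
    p2_out = int(x < y)
    # Output for P1: number of ones in the bit-wise AND of the two inputs.
    # For 32-bit inputs this lies in 0..32, which fits in six bits.
    p1_out = bin(x & y).count("1")
    return p1_out, p2_out


if __name__ == "__main__":
    print(example1_functionality(0b1011, 0b0110))
```

The secure protocol computes the same pair of outputs, but neither party sees the other party's input.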

Applying the Fairplay compiler to this functionality we obtain a circuit with
689 gates. We produce two circuits from this output; the first, denoted
$C_{2.3}$, is to allow comparison with the existing state of the art, namely
the methods of [?]. This is a circuit which uses 2-to-1 and 3-to-1 gates and
has 245 gates. The second circuit we use, denoted $C_{xor}$, replaces, via the
techniques of Section [?], all complex gates with 2-to-1 gates, and tries to
minimise the number of non-XOR gates in the circuit. This circuit has 531
gates, 240 of which are non-XOR gates. An extra six gates are needed in each
circuit so as to encode P1’s output for transmission back to P1, without P2
learning the value.

The above circuit sizes are purely to implement the functionality; they do
not include the extra wires and gates required to transmit P1’s output back
to P1 (for details of how this is done see the full version), nor do they
include the extension of the circuit to cope with P2’s input in the case of
Covert and Malicious adversaries. (We refer to the two methods for encoding
P2’s input as the independent inputs and the random combinations methods. For
the details of these methods see [20] or the full version. These methods add a
set of XOR gates to the circuit, which transform P2’s inputs using a random
linear encoding.) The sizes of the extended circuits, and the resulting
run-times, are given in Table 1, which measures the total elapsed wall-clock
times in seconds for the various cases. The calculations were performed on two
machines with Intel Core 2 Duo processors running at 3.0 GHz, with 4GB of RAM,
connected by a 1GB ethernet. The hash function H used in the protocol was
implemented as SHA-256.

The column of “Total KBytes” contains the total number of kilobytes of data 
which were transferred during the run of the protocol. The column “Method” 
details the type of computation used, as follows: 

- Base: Denotes the optimisations proposed in [20], extended to the case of
  Covert and Honest adversaries, which we use for comparison purposes, as
  our baseline implementation. This uses the $C_{2.3}$ circuit mentioned
  above, the KDF which is secure in the standard model, and the OT of
  Hazay-Lindell [?] as opposed to that of Peikert et al. [?].

- PRF-SS: This denotes using the secret sharing based method of Section [?]
  to reduce the size of the garbled tables. For this the KDF is assumed to be
  a PRF, but not correlation robust.

- CoR-GRR: This denotes an implementation which is only secure assuming
  the KDF is correlation robust. It uses the free XOR trick and the method of
  Garbled Row Reduction, from Section [?], to reduce the size of the remaining
  garbled tables.

- ROM-GRR: As above for CoR-GRR, but all hash functions used are modelled
  as random oracles. This means we can implement our KDF via a single hash
  function call, based on the method described in Footnote [?].

The columns denoted “No. Gates” and “% XOR Gates” describe the number of
gates, and the percentage of XOR gates, in the extended circuit (which
transfers P1’s outputs and applies the extension described in the full
version, encoding P2’s input).

For the Covert and Malicious cases the “Input Enc.” column denotes whether
we use the Independent Inputs technique or the Random Combinations technique
for the extended circuit construction. See the full version for details. From
the table we can deduce the following conclusions:

- The protocol in the semi-honest setting runs about 10-20 times faster than
  in the covert setting, which in turn runs about 15-20 times faster than in
  the malicious setting.

- A lot of the extra data needed to be transmitted in the Malicious case is
  related to the large number of commitments and decommitments which need
  to be transmitted. Thus our optimisation techniques are less effective in
  the Malicious case. This points to a clear direction for future research in
  optimising the Malicious case.

- If one is not willing to assume that the KDF is correlation robust, we see
  that using our technique based on secret sharing can reduce the amount of
  data being transmitted, compared to the base scheme, without increasing
  the computational cost.

- In all cases we see that the correlation robust variant using Garbled Row
  Reduction is the most efficient variant. The extra efficiency comes from the
  free XORs, which reduce both the number of encryptions/decryptions which
  need to be performed and also the amount of data needing to be transmitted.

- Note that if we assume the random oracle model, and so can implement our
  KDF via a single hash function call, then for Covert adversaries the
  protocols run significantly faster. That this does not apply as much to the
  Malicious case is due to the fact that most of the time in the Malicious
  case is spent creating, sending and verifying the various commitments.

We pause to compare our two optimisations with the optimisation in bandwidth
suggested in [?]. In our system P1, the circuit constructor, sends commitments
to all circuits that it constructs and to its own inputs, and a random subset
of these committed values is checked by P2. In [?] it is suggested that P1
commits to a random seed, and uses this to generate the circuit. Then only
the commitment to this seed, and eventually its decommitment, need to be
transmitted. This means that P2 needs to compute the circuit given the seed.
Whilst this optimisation clearly significantly reduces the consumed bandwidth,
it actually leads to a significant increase in the time needed to perform the
protocol. To see this consider our Covert experiments in Table 1. The
optimisation in [?] would reduce the entry for the “Send Time” column
practically to zero, but P2 would now need to recompute almost all of the
calculations in the “Precomp Time” column. Thus the technique of [?] is only
to be compared to ours in the situation where bandwidth is very expensive and
CPU time is very cheap.
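
The bandwidth-versus-computation trade-off described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the hash-based commitment, the SHA-256 counter-mode expansion, and the "two 32-byte blocks per gate" stand-in for garbling are not the constructions used in the experiments.

```python
import hashlib
import os


def prg(seed: bytes, n_blocks: int) -> list:
    """Expand a seed into n_blocks 32-byte blocks (SHA-256 in counter mode)."""
    return [hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
            for i in range(n_blocks)]


def commit(value: bytes, nonce: bytes) -> bytes:
    """Toy hash-based commitment (illustrative assumption only)."""
    return hashlib.sha256(nonce + value).digest()


def circuit_material(seed: bytes, n_gates: int) -> list:
    """Stand-in for garbling: derive two 32-byte blocks per gate from the seed."""
    return prg(seed, 2 * n_gates)


# P1's side: derive the material from a seed and commit only to the seed.
seed, nonce = os.urandom(16), os.urandom(16)
n_gates = 537
material = circuit_material(seed, n_gates)
seed_commitment = commit(seed, nonce)

# P2's side, once P1 opens (seed, nonce): check the commitment, then repeat
# the derivation that P1 has already carried out.
assert commit(seed, nonce) == seed_commitment
recomputed = circuit_material(seed, n_gates)

bytes_if_sent = sum(len(block) for block in material)
bytes_for_seed = len(seed_commitment) + len(seed) + len(nonce)
```

Here `bytes_for_seed` is a small constant while `bytes_if_sent` grows with the circuit, but P2 has to run `circuit_material` itself, mirroring the shift of cost from the send phase to a repeat of the precomputation.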

Before passing on to our larger example we note the following. If we let p
denote the proportion of XOR gates within a circuit, and we let N denote the
amount of data needed to be sent per gate in the standard Yao construction,
then the average amount of data needed to be sent per circuit gate when using
the free XOR gates and GRR methods is $\frac{3}{4} \cdot (1 - p) \cdot N$,
whereas if we do not use the free XOR gate method and instead use the method
based on secret sharing, this value becomes $N/2$. Hence, if we are willing to
assume correlation robust KDFs, then the method which uses secret sharing and
does not use the free XOR method will be more efficient as long as the
fraction of XOR gates, p, is smaller than 1/3. However, as can be seen from
the column entitled “% XOR Gates”, this proportion is generally much larger
than 1/3, especially in the case of Covert and Malicious adversaries where we
have had to extend the circuit by a large linear component. This expansion is
performed to cope with possible adversarial behaviour related to P2’s input;
see the full version for details. One should note that these theoretical
estimates of bandwidth are never achieved fully in practice due to overheads
in the underlying data transmission mechanism and the fact that they assume a
bit-oriented communication mechanism, whereas practical communication is
performed in bytes. Hence the saving we achieve in gate transmission is about
5-10% less than one would predict purely by theory.
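
The break-even point p = 1/3 follows from comparing the two per-gate costs directly. The sketch below does this arithmetic with exact fractions; the function names are hypothetical and N is normalised to 1.

```python
from fractions import Fraction


def grr_free_xor_cost(p, n=1):
    """Average data per gate with free XOR and GRR: 3/4 * (1 - p) * N."""
    return Fraction(3, 4) * (1 - Fraction(p)) * n


def secret_sharing_cost(n=1):
    """Average data per gate with the secret-sharing method: N / 2."""
    return Fraction(n, 2)


if __name__ == "__main__":
    # XOR proportions of 11%, 1/3 and 55%, chosen to straddle the threshold.
    for p in (Fraction(11, 100), Fraction(1, 3), Fraction(55, 100)):
        print(p, grr_free_xor_cost(p), secret_sharing_cost())
```

Solving $\frac{3}{4}(1 - p)N = N/2$ gives $1 - p = 2/3$, i.e. $p = 1/3$; below that proportion the secret-sharing method sends less data, and above it the free XOR and GRR combination does.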

Example 2 - Evaluating AES: As our second example we created a circuit
which computes an AES encryption of a single 128-bit block with respect to
a 128-bit key. Here P1’s input is the secret key, and P2’s input is the
message block. We require that P2 learns the encryption of its message under
P1’s secret key, and that P1 learns nothing. Compiling such a circuit using
the Fairplay compiler, and applying various optimisations, resulted in a
circuit with 33880 gates, where each gate is a 2-to-1 gate. This circuit
was derived in a way to try to minimize the number of non-XOR gates. Again,
we stress, the above circuit size purely implements the AES functionality; it
does not include the extension of the circuit to cope with P2’s input in the
case of Covert and Malicious adversaries. Note that the key schedule takes up
only about 15% of the circuit, hence encrypting a sequence of message blocks
as in CBC-Mode encryption will scale almost linearly with respect to our data.

Table 2. Experimental Results for Example 2 (Again times are in seconds)

Adv.         Input Enc.     Method    No.     % XOR   Precomp  Send   OT     Calc   Total  Total
                                      Gates   Gates   Time     Time   Time   Time   Time   KBytes
Semi-Honest  -              Base      28216   56      5        2      4      3      14     3162
                            PRF-SS    33880   66      5        1      3      3      12     1752
                            CoR-GRR   33880   66      2        1      2      2      7      503
                            ROM-GRR   33880   66      1        1      3      2      7      503
Covert       Indep. Inputs  Base      28600   56      96       47     18     45     206    51899
                            PRF-SS    34264   67      92       36     13     50     191    29380
                            CoR-GRR   34264   67      40       21     11     23     95     9078
                            ROM-GRR   34264   67      22       21     11     6      60     8942
Malic.       Random Comb.   Base      40253   69      1250     448    39     887    2624   987442
                            PRF-SS    45944   75      1184     392    34     829    2439   711729
                            CoR-GRR   45960   75      483      270    34     361    1148   406010
                            ROM-GRR   45881   75      453      276    35     350    1114   417907

We repeated our experiments from above, but in Table 2 we only present the
times for the most efficient choice for the input encoding.

We conclude that performing the Yao protocol is certainly feasible on
complicated functionalities such as AES encryption. For the case of
semi-honest and covert adversaries we again see that the computation and
bandwidth consumed, when we use correlation robust KDFs and the GRR method,
are greatly reduced in comparison to the base case. If one is not willing to
assume correlation robust KDFs (or use the ROM) then our secret sharing based
optimisation greatly reduces the bandwidth without affecting the run times.
For the malicious case the improvement in the secret sharing based version is
less pronounced due to the large number of commitments which need to be
transmitted and opened. This clearly points to the place where future
optimisation research needs to be performed, namely in reducing the number of
commitments needed in the situation of malicious adversaries. However, even
without such future optimisation we note that the run time can be
significantly reduced by taking advantage of the inherent parallelism in the
algorithm in the Malicious case (in which P1 generates many commitments and P2
verifies a subset of them). For web service or cloud computing applications,
where server farms are commonplace, an improvement in computational time by a
factor of around $s_1$ could be expected.

We end by noting that many application domains of a secure evaluation
of AES, for example the one-time program example from [?], require only
security against semi-honest adversaries. Hence, such applications are already
within the reach of practical realisation. Furthermore, this application
requires no computation of the OT or data to be sent. Thus the party generating
the one-time program will take the time needed in our Precomp Time column, and
the evaluator (after querying the one-time memory) will take the time needed
in the Calc Time column.

Abstract. Multi-party secure computations are general important procedures to
compute any function while keeping the security of private inputs. In this work 
we ask whether preprocessing can allow low latency (that is, small round) secure 
multi-party protocols that are universally-composable (UC). In particular, we al- 
low any polynomial time preprocessing as long as it is independent of the exact 
circuit and actual inputs of the specific instance problem to solve, with only a 
bound k on the number of gates in the circuits known. 

To address the question, we first define the model of “Multi-Party Computa-
tion on Encrypted Data” (MP-CED), implicitly described in [FH96, JJ00, CDN01,
DN03]. In this model, computing parties establish a threshold public key in a
preprocessing stage, and only then private data, encrypted under the shared
public key, is revealed. The computing parties then get the computational
circuit they agree upon and evaluate the circuit on the encrypted data. The
MP-CED model is interesting since it is well suited for modern computing
environments, where many repeated computations on overlapping data are
performed.

We present two different round-efficient protocols in this model: 

- The first protocol generates k garbled gates in the preprocessing stage and 
requires only two (online) rounds. 

  - The second protocol generates a garbled universal circuit of size O(k log k)
    in the preprocessing stage, and requires only one (online) round (i.e., an
    obvious lower bound), and therefore it can run asynchronously.

Both protocols are secure against an active, static adversary controlling any num- 
ber of parties. When the fraction of parties the adversary can corrupt is less than 
half, the adversary cannot force the protocols to abort. 

The MP-CED model is closely related to the general Multi-Party Computation
(MPC) model and, in fact, both can be reduced to each other. The first (resp.
second) protocol above naturally gives protocols for three-round (resp.
two-round) universally composable MPC secure against an active, static
adversary controlling any number of parties (with preprocessing).

Keywords: Computing with Encrypted Data, Multi-Party Computation, Public
key Cryptography, Cryptographic Protocols, Universal Composition.

1 Introduction

Secure Multi-party Computation (MPC). Protocols for MPC enable a set of parties
to correctly evaluate a function such that no information about the private
inputs of the parties is revealed, beyond what is leaked by the output of the
function. This notion was first presented by Yao [Y86] for the two-party case,
and by Goldreich et al. [GMW87] for the multi-party case. However,
implementations of MPC are notoriously inefficient. Many protocols implementing
them have delays associated with the depth of the circuit, and even constant
round protocols produce very long delays. The question that we want to settle
in this work is whether one can use preprocessing computation in order to “be
ready” once the inputs and the actual circuit (problem) to compute on are given.
Note that the world of computing is transforming into “cloud services” where
parties can “rent” computational resources. Thus, it may make sense to perform
a lengthy preprocessing in the background, with no specific input and problem
to solve in mind, just as a preparation. To this end cloud resources can be
employed on behalf of users, and massive computations and communication can be
performed. Then in the online stage, once the input is given and the circuit
determined, the computation can be performed much faster given the
preprocessing. As long as at least one of the servers in the cloud is not
corrupted, the correctness and privacy of the online stage computation is
guaranteed.

We consider the following variation on secure multi-party computation, called
multi-party computing with encrypted data (MP-CED): (1) The computing parties publish a
shared public key, and hold shares of the matching private key. (2) The parties also know 
some bound on the circuit size that they will be required to compute securely. The par- 
ties then perform a preprocessing stage. For this stage too, we may try to minimize the 
parties’ work and computation rounds, but this is not the main goal, which is the effi- 
ciency of the on-line stage. (3) The input distribution is a database of encrypted data that 
can be published by many parties (not necessarily those taking part in the computation); 
i.e., think about the parties as a service (like the census bureau) computing on behalf 
of a larger population. (4) The concrete computation circuit (or circuits) is given, and 
the input to use from the database (their indices in the database) are determined. Then 
and only then (5) the parties are engaged in a short computation to achieve the task and 
produce the output while protecting the private data. Note that the input database may 
be reused for many computations. 

We remark that our model is somewhat related to a multi-party extension of the
model by Rivest, Adleman and Dertouzos [RAD78]. They put forth a scenario for
secure computation over a database of encrypted data, called Computing with
Encrypted Data (CED). This model is highly attractive since it represents the
case where a database is first collected and maintained and only later a
computation on it is decided upon and executed (e.g., data mining and
statistical database computation done over the encrypted database). We discuss
the encrypted data model and the multi-party version here, and in fact show
that MP-CED and MPC can be reduced to each other (shown in Section [?]).

1.1 Motivation 

We consider protocols in the universal composability (UC) framework introduced
by Canetti [C01]. UC secure protocols remain secure even when executed
concurrently with arbitrary other protocols running in some larger network, and
can be used as subroutines of larger protocols in a modular fashion.

Round-Efficient Protocols with Preprocessing. Round complexity is an important
criterion for the efficiency of an MPC protocol. A long line of work, including
[BMR90, IK00, GIKR01, DI05, DI06, DIK+08], focused on reducing both the round
complexity and communication complexity.

Also, it is known that UC secure computation of general functions is not
possible in the plain model in the case of honest minority. In particular, UC
secure two-party computation of a wide class of functionalities was ruled out
by [CF01, CKL03]. To circumvent these impossibility results, it is common to
assume some pre-computation setup, and the most common assumption is that a
common reference string (CRS) is made available to the parties before the
computation. Canetti et al. [CLOS02] showed that (under suitable cryptographic
assumptions) a CRS suffices for UC secure MPC of any well-formed functionality.

In our work, we consider a stronger relaxation on the setup, called general
preprocessing [DI05], in which the parties perform some work as long as it is
independent of the inputs and the circuit for which the actual computation is
to be done later. The main motivation for this model is to reduce the amount of
work during the execution of the protocol beyond a preprocessing phase.

Considering the two aspects above, we ask the following natural question: 

   Allowing any polynomial time preprocessing (in some input parameter) before
   the circuit (whose size is bounded by the same input parameter) and the
   inputs are known, is there a very small constant round protocol?


1.2 Our Results 

We address the aforementioned question affirmatively by constructing two
different round-efficient protocols for MP-CED, which we call P1 and P2. Both
protocols can be naturally transformed into round-efficient protocols for MPC
(cf. Section [?]). Each protocol has its own advantage depending on the
following parameters:

1. round complexity in the online stage (our major concern), 

2. round complexity in the preprocessing stage, and 

3. the number of gates constructed throughout the protocol. 

In terms of online round complexity, protocol P1 is “two rounds” whereas
protocol P2 is “one round” (which is optimal, since even non-secure computation
needs to collect the data, and this takes one round). There are some cases,
however, in which the preprocessing round complexity of P1 is better, under
some efficiency considerations. We use general constant-round MPC protocols
[IPS08] for the preprocessing stage in P2, whereas in P1 we can use the
protocol given in Appendix [?], which requires exactly 2n rounds. When n is
small enough, preprocessing in P1 can be more round-efficient (when n is large,
a general MPC protocol can be used in P1, too). Also, the number of gates
constructed in P2 is larger than that in P1. To evaluate a circuit with up to k
gates, P1 constructs k garbled gates in the preprocessing stage, as explained
below. In contrast, P2 generates a universal circuit [V76] in the preprocessing
stage, which is later used (in the online stage) to evaluate a given circuit.
The smallest known universal circuit that can evaluate a circuit with k gates
has O(k log k) gates [KS08]. We overview the two protocols in the following.

1 Preprocessing in [?] is independent only of the inputs (it depends on the
circuit to be evaluated), whereas we require preprocessing to be independent
both of the circuit and of the inputs.

First Protocol (P1). At a high level, we follow the framework of Yao’s garbled
circuit technique. The main difference, however, is that in our protocol
garbling is done at the level of individual gates, so that this procedure can
be executed in the preprocessing stage independently of the circuit to be given
and computed later. In the online stage, the wires between gates are
constructed according to the given circuit.

- In the preprocessing stage, the parties generate a ‘garbled’ truth table for
  each individual gate. Truth tables are for NAND gates, and they have four
  rows and three columns: left-input, right-input, and output. The rows are
  randomly shuffled, and each element is an encryption of a Boolean value. We
  emphasize that no party knows anything more than the fact that it is a
  randomly shuffled encrypted table for NAND.

  In addition, a fresh pair of public key and (encrypted) private key is
  generated for each row. This key is used for constructing encrypted wiring
  information in the online stage, when the circuit is given.

- In the online stage, given the encrypted data and a circuit, the computing
  parties ‘connect’ truth tables by adding wiring information. The wiring
  information tells, given two tables $T_{prec}$, $T_{succ}$ according to the
  topology of the circuit, which row of $T_{prec}$’s output column is equal to
  which row of $T_{succ}$’s input column. We note that this information should
  be carefully revealed; otherwise, the adversary may try computing different
  rows of the truth tables using the wirings, and may learn more than is
  allowed. In fact, during the computation (online stage), exactly one row’s
  wirings for each table should be revealed.

  To enable such wirings we introduce Multi-Party Conditional Oblivious
  Decryption Exposure (M-CODE) (in Section [?]), which is a multi-party
  extension of the CODE functionality, introduced in [CEJ+07] for the two-party
  case. M-CODE assumes a group of parties share a secret key x of a public key
  y. Three ciphertexts $c_{out}$, $c_{in}$, $c_{key}$, all encrypted under y,
  and a new public key z are given as input. For $i \in \{out, in, key\}$, let
  $m_i$ be the plaintext encrypted in $c_i$. If $m_{out}$ equals $m_{in}$,
  M-CODE outputs $E_z(m_{key})$. Otherwise, M-CODE chooses a random value r and
  outputs $E_z(r)$. The computing parties use M-CODE such that, for each row of
  a truth table, the three ciphertexts of the M-CODE are (1) the output value
  of the previous table, (2) the input value of this row, and (3) the secret
  key for this row. We refer the reader to Section [?] for more details.

  With a two-round implementation of M-CODE for ElGamal encryption, we obtain
  a two-round protocol for MP-CED and a three-round protocol for MPC.
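
The input/output behaviour of M-CODE described above can be sketched as an ideal functionality. This is only an illustration of what is computed, not of the multi-party protocol: a single trusted party holding x stands in for the group of parties that share x, the group parameters are tiny toy values, and all function names are hypothetical.

```python
import secrets

# Toy safe-prime group (assumed for illustration): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4


def keygen():
    x = secrets.randbelow(Q)
    return pow(G, x, P), x                      # (public key, secret key)


def enc(pub, m):
    r = secrets.randbelow(Q)
    return pow(G, r, P), m * pow(pub, r, P) % P


def dec(x, c):
    alpha, beta = c
    # alpha lies in the order-Q subgroup, so alpha^(Q - x) equals alpha^(-x).
    return beta * pow(alpha, Q - x, P) % P


def m_code_ideal(x, c_out, c_in, c_key, z):
    """Ideal M-CODE: expose m_key under z only if m_out equals m_in."""
    if dec(x, c_out) == dec(x, c_in):
        return enc(z, dec(x, c_key))
    r = pow(G, secrets.randbelow(Q), P)         # random group element
    return enc(z, r)
```

Used on one row of a truth table, `c_out` would be the previous table's output value, `c_in` this row's input value, and `c_key` the row's encrypted secret key, so a matching row yields its key re-encrypted under z while a non-matching row yields an encryption of a random value.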

Theorem 1. Assuming the DDH assumption holds, protocol P1 is a two-round UC
secure protocol for MP-CED in the $\mathcal{F}_{ZK}$-hybrid model (and, thereby,
a three-round UC secure protocol for MPC in the $\mathcal{F}_{ZK}$-hybrid model
in the general preprocessing model) against an active and static adversary as
long as at most t < n computing parties are corrupted. The protocol manipulates
a number of gates linear in the circuit size. Furthermore, if t < n/2 parties
are corrupted, P1 is robust against abort.

Second Protocol (P2). Protocol P2 follows Yao’s garbled circuit technique more
closely than P1. However, the circuit that is to be garbled is a universal
circuit [V76, KS08], to maintain independence of the circuit to be given.
Optimal round complexity in the online stage is achieved by putting a simple
constraint on the input-layer labels in the garbled circuit and by employing
the multiplicative homomorphism of ElGamal encryption. As in the first
protocol, a group of parties share a secret key x of a public key y.

- In the preprocessing stage, the parties generate a garbled version of a
  universal circuit $C_U$, with some special restrictions on the keys of input
  wires. In the garbled circuit, there are two keys $w^i_0$ and $w^i_1$ for
  each wire i, where $w^i_b$ corresponds to the wire carrying bit b (see
  Section [?] for more detail). The special restriction on input wires is that
  $w^i_1 / w^i_0 = h$ for a random global value h unknown to any party. The two
  keys can be constructed by picking $w^i_0$ uniformly at random and letting
  $w^i_1 = h \cdot w^i_0$. In addition to the garbled circuit of $C_U$, the
  following encryptions are generated: (1) the encryption $E_y(h)$ and (2)
  $E_y(w^i_0)$ for each input wire i. Construction of a garbled circuit along
  with the aforementioned encryptions, i.e., $E_y(h)$ and the $E_y(w^i_0)$’s,
  can be performed using a constant-round UC secure protocol for general MPC
  [KOS03, IPS08]. Input contribution of a bit 0 is done by $E_y(h^0)$, and for
  a bit 1, re-encrypted $E_y(h^1)$ is used via homomorphism.

- In the online stage, for each input wire i where a bit b is the contributed
  input for the wire, the computing parties obtain $w^i_b$. The encryption
  $E_y(w^i_b)$ can be obtained via homomorphism given the encrypted input
  $c_i = E_y(h^b)$, giving $E_y(w^i_0) \cdot c_i = E_y(w^i_0 h^b) = E_y(w^i_b)$,
  since $w^i_1 = h \cdot w^i_0$. Now the parties obtain the key $w^i_b$ for
  each input wire i using threshold decryption and can locally evaluate the
  garbled circuit. Note that $w^i_b$ does not leak any information on b since
  it is randomly distributed (with $w^i_{1-b}$ hidden).
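
The online-stage step for a single input wire can be checked numerically. The sketch below is an illustration under simplifying assumptions: a single key holder replaces threshold decryption, the group parameters are tiny toy values, and the helper names are hypothetical.

```python
import secrets

# Toy safe-prime group (assumed for illustration): P = 2*Q + 1, G has order Q.
P, Q, G = 2039, 1019, 4


def rand_elem():
    return pow(G, secrets.randbelow(Q), P)


def keygen():
    x = secrets.randbelow(Q)
    return pow(G, x, P), x


def enc(y, m):
    r = secrets.randbelow(Q)
    return pow(G, r, P), m * pow(y, r, P) % P


def dec(x, c):
    alpha, beta = c
    return beta * pow(alpha, Q - x, P) % P      # alpha^(Q - x) = alpha^(-x)


def mul(c1, c2):
    """Component-wise product of two ciphertexts under the same key."""
    return c1[0] * c2[0] % P, c1[1] * c2[1] % P


def select_input_key(x, enc_w0, enc_input):
    """Combine E_y(w_0) with the encrypted input E_y(h^b), then decrypt."""
    return dec(x, mul(enc_w0, enc_input))


# Preprocessing for one input wire: w_1 = h * w_0, and E_y(w_0) is published.
y, x = keygen()
h = rand_elem()
w0 = rand_elem()
w1 = h * w0 % P
enc_w0 = enc(y, w0)

# Online stage: the contributed input is E_y(h^b) for the input bit b.
keys = [select_input_key(x, enc_w0, enc(y, pow(h, b, P))) for b in (0, 1)]
```

For b = 0 the product decrypts to $w_0$ and for b = 1 to $h \cdot w_0 = w_1$, so the decrypted value is exactly the wire key for the contributed bit.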

Theorem 2. Assuming the DDH assumption holds, protocol P2 is a one-round UC
secure protocol for MP-CED in the $\mathcal{F}_{ZK}$-hybrid model (and, thereby,
a two-round UC secure protocol for MPC in the $\mathcal{F}_{ZK}$-hybrid model in
the general preprocessing model) against an active and static adversary as long
as at most t < n computing parties are corrupted. The protocol processes
k log k gates where k is the circuit size. Furthermore, if t < n/2 parties are
corrupted, P2 is robust against abort.

1.3 Related Work 

Round Complexity. Beaver et al. [BMR90] showed the first MPC protocol that
required a constant (but large) number of rounds, and Damgård and Ishai [DI05]
presented the first adaptively UC secure protocol that achieves two rounds in
the (linear) preprocessing model when the number of malicious parties t < n/5,
and some higher constant number of rounds when t < n/2. Recently, Ishai et al.
constructed a UC secure protocol with malicious majority in the OT-hybrid model
running in a (large) constant number of rounds [IPS08] (see Fig. 1).

2 Instantiation of protocol P1 (in particular, key setup in the preprocessing)
is parameterized by t. Therefore protocol P1 is not a ‘best-of-both-worlds’
protocol [IKLP06]. This is also true of P2.

Fig. 1. UC Secure Constant-Round MPC Protocols (Left) and MP-CED Protocols
(Right). We denote by d the depth of a given circuit, by n the number of
parties, and by t the number of corrupted parties. P1 and P2 denote the
protocols proposed here. Here the column ‘rounds’ means the number of rounds in
the online stage.


For the two-party setting, which is a special case of MPC, Katz and Ostrovsky
[KO04] showed that it is impossible to construct a secure protocol running in
four rounds using enhanced trapdoor permutations (eTDP) or homomorphic
encryption in a black-box manner in the plain model, and they constructed a
five-round protocol. To overcome this lower bound, Horvitz and Katz [HK07] used
a CRS to construct a UC secure two-party protocol in two rounds. Nielsen and
Orlandi [NO09] gave a two-party protocol using a cut-and-choose approach. At a
high level, their idea is somewhat similar to ours: after many garbled gates
are generated, they are connected to each other according to the circuit to be
evaluated.

In the (non-UC) stand-alone setting, the work of [IK00, AIK05] gave a general
non-interactive reduction of any n-party functionality computed by a polynomial
size Boolean circuit into a (possibly randomized) functionality of degree 3
over GF(2). Combining this reduction with any secure protocol with malicious
majority (for example, [GMW87]) leads to round-efficient protocols in the
stand-alone setting.

MP-CED. Some nontrivial instantiations for CED were shown, originating with
Sander et al. [SYY99], who gave a protocol for circuits in $NC^1$. Beaver [B00]
extended this result to accommodate any function in NLOGSPACE [BL96]. Recently,
Gentry presented a construction for any polynomial size circuit by showing a
doubly-homomorphic encryption scheme from ideal lattices [?]; however, it is
not yet clear if this can give efficient protocols for MP-CED (see discussion
in Section [?]).

MP-CED was also considered by Franklin and Haber [FH96] and the subsequent
works [JJ00, CDN01, DN03]. In their works, after a threshold encryption key is
established, each party broadcasts the encryption of its input, and the parties
evaluate the circuit on the encrypted data. However, they do not explicitly
treat the setting as a unique model for MP-CED, with a specific setup state
that is independent of the inputs and the circuits to be computed, and do not
consider input separation, where inputs can be contributed by parties that do
not take part in the computation. Note that all these previous works in the
model dealt with the two-party case, which we extend herein to the multi-party
case.




The protocol given by Cramer et al. [CDN01] computes an arithmetic circuit and
achieves security in the case of honest majority, but the number of rounds is
linear in the depth of the circuit. A UC adaptively secure protocol with the
same round complexity was given by [DN03]. Jakobsson and Juels [JJ00] use a
mix-and-match approach to compute on encrypted data, but their approach
requires even more rounds (linear in the sum of the depth of the circuit and
the number of parties). Figure 1 lists these previous works in relation to our
protocols (while concentrating on online rounds, and omitting some of the
advantages our results have beyond the table).

2 Preliminaries 

For any integer $t$, let $[t] = \{0, 1, \ldots, t-1\}$. Let $k$ be a security parameter. We choose
a cyclic group $G_q$ of order $q \approx 2^k$ with a generator $g$ where the DDH problem [DH76]
is hard. For example, $G_q$ can be a subgroup of order $q$ of a multiplicative group $Z_p^*$ for
a safe prime $p = 2q + 1$, i.e., $G_q = \{g^0, g^1, \ldots, g^{q-1}\} \pmod{p}$. We assume $G_q$ is
known in advance.

ElGamal Encryption. ElGamal encryption [E85] is semantically secure under the
DDH assumption over $G_q$ [TY98]. The key generation algorithm generates a public/
secret key pair $(y, x)$ where $x \in_R [q]$ and $y = g^x$. Encryption of a message $m \in G_q$
under a public key $y$, denoted by $E_y(m)$, is $(g^r, m y^r)$ where $r \in_R [q]$. Decryption of a
ciphertext $c = (\alpha, \beta)$ with the secret key $x$, denoted by $D_x(c)$, is $\beta / \alpha^x$.

Homomorphism. Multiplication of two ciphertexts $E_y(m_1) = (g^{r_1}, m_1 y^{r_1})$ and
$E_y(m_2) = (g^{r_2}, m_2 y^{r_2})$ is defined as $(g^{r_1+r_2}, m_1 m_2 y^{r_1+r_2})$, which shows the
homomorphism of ElGamal encryption (i.e., $E_y(m_1) \cdot E_y(m_2) = E_y(m_1 \cdot m_2)$). In
addition, encryption keys are also homomorphic in the sense that given key pairs
$\{(y_i = g^{x_i}, x_i)\}_i$, the pair $(\prod_i y_i, \sum_i x_i)$ is a valid key pair. When two ciphertexts
encrypt the same message, we denote $c_1 = c_2$.
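
As an illustration of the definitions above, here is a minimal Python sketch of ElGamal
over a toy subgroup. The parameters (p = 2039, q = 1019, g = 4) and the function names
are assumptions chosen for readability, not values from the paper, and are far too small
to be secure.

```python
import secrets

# Toy parameters (illustrative assumption): safe prime P = 2Q + 1.
# G = 2^2 is a quadratic residue, so it generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4

def keygen():
    """Return a key pair (y, x) with x random in [q] and y = g^x."""
    x = secrets.randbelow(Q)
    return pow(G, x, P), x

def encrypt(y, m):
    """E_y(m) = (g^r, m * y^r) for a message m in the subgroup."""
    r = secrets.randbelow(Q)
    return (pow(G, r, P), m * pow(y, r, P) % P)

def decrypt(x, c):
    """D_x(c) = beta / alpha^x."""
    alpha, beta = c
    return beta * pow(alpha, -x, P) % P

def multiply(c1, c2):
    """Componentwise product; encrypts the product of the two plaintexts."""
    return (c1[0] * c2[0] % P, c1[1] * c2[1] % P)
```

Multiplying two public keys and adding the matching secret keys modulo q gives another
valid key pair, which is the key homomorphism mentioned above.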

Zero-Knowledge Proofs of Knowledge (ZK-PoK). A proof of knowledge is a proof
for a relation $R$, in which the prover convinces the verifier that an instance is in the
language, and also that the prover knows a witness for this instance. We will use
standard notation to denote proofs of knowledge related to discrete log. For example,
$PK\{b : a = g^b\}$ denotes a proof of knowledge where the prover convinces the verifier
that she knows the value of $b$ such that $a = g^b$, when $a$ is known to both.

In the common reference string (CRS) model, we can use non-interactive
zero-knowledge proofs (NIZK) due to De Santis et al. [DDO+01] (see the discussion in
[CLOS02, Section 6]), which are UC-secure. In the random oracle model (ROM),
the above proof systems can be made efficient NIZK using the standard Fiat-Shamir
technique [FS86] combined with OR proofs of $\Sigma$-protocols [CDS94].

Secret Sharing [S79, F87]. A secret sharing scheme allows a secret $s \in [q]$ to be shared
among $n$ parties, such that a threshold of $t + 1$ parties can recover the secret, whereas
any smaller set of parties cannot learn anything about the secret. In Shamir's secret
sharing scheme, the shares are values of a degree-$t$ polynomial, and the secret is the
free coefficient of the polynomial.
We show below how the parties can share and recover the secret $s$. Moreover, the
parties may choose to recover $d^s$ for some $d \in G_q$, or an ElGamal encryption of $d^s$
(without learning anything about the secret $s$).

- Sharing: A dealer chooses at random a degree-$t$ polynomial
  $Q(x) := s + a_1 x + \cdots + a_t x^t$, where the free coefficient is the secret $s$. The share
  of party $P_i$ is $s_i = Q(i)$.

- Recovering $s$: Let $T$ be a set of $t + 1$ parties. They evaluate
  $Q'(0) = \sum_{i \in T} s_i L_i(0)$ to recover $s$, where $L_i$ is a Lagrangian on the points
  in $T$ (see footnote 3).

- Recovering an exponentiation $d^s$: Similar to above, the parties can evaluate
  $d^s = d^{Q'(0)} = d^{\sum_{i \in T} s_i L_i(0)} = \prod_{i \in T} (d^{s_i})^{L_i(0)}$, using only
  $\{d^{s_i}\}_{i \in T}$.

- Recovering $E_y(d^s)$: Using the multiplicative homomorphism of ElGamal, the parties
  evaluate $E_y(d^s) = E_y(d^{Q'(0)}) = \prod_{i \in T} E_y(d^{s_i L_i(0)}) = \prod_{i \in T} E_y(d^{s_i})^{L_i(0)}$,
  using only $\{E_y(d^{s_i})\}_{i \in T}$.
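
The sharing and recovery steps above can be sketched in Python as follows. This is a toy
illustration under assumed parameters (q = 1019, p = 2039, hypothetical function names);
it shows that Lagrange interpolation at 0 works both on the shares and "in the exponent".

```python
import secrets

P, Q = 2039, 1019  # toy safe-prime group (illustrative assumption)

def share(s, t, n):
    """Shares s_i = Q(i) of a random degree-t polynomial with Q(0) = s."""
    coeffs = [s] + [secrets.randbelow(Q) for _ in range(t)]
    return {i: sum(c * pow(i, j, Q) for j, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}

def lagrange_at_zero(i, T):
    """L_i(0) modulo q for the Lagrangian on the points in T."""
    num = den = 1
    for j in T:
        if j != i:
            num = num * (-j) % Q
            den = den * (i - j) % Q
    return num * pow(den, -1, Q) % Q

def recover(shares):
    """s = sum_i s_i * L_i(0)."""
    T = list(shares)
    return sum(shares[i] * lagrange_at_zero(i, T) for i in T) % Q

def recover_in_exponent(exp_shares):
    """d^s = prod_i (d^{s_i})^{L_i(0)}, using only the values d^{s_i}."""
    T = list(exp_shares)
    out = 1
    for i in T:
        out = out * pow(exp_shares[i], lagrange_at_zero(i, T), P) % P
    return out
```

The same loop applied componentwise to ciphertexts $E_y(d^{s_i})$ gives the last recovery
variant, since raising an ElGamal ciphertext to a power raises its plaintext to that power.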

Multi-party Conditional Oblivious Decryption Exposure (M-CODE). We introduce
Multi-Party Conditional Oblivious Decryption Exposure (M-CODE). M-CODE assumes
a group of parties share a secret key $x$ of a public key $y$. Three ciphertexts
$c_{out}$, $c_{in}$, $c_{key}$, all encrypted under $y$, and a new public key $z$ are given as input.
For $\ell \in \{out, in, key\}$, let $m_\ell$ be the plaintext encrypted in $c_\ell$. If $m_{out}$ equals
$m_{in}$, M-CODE outputs $E_z(m_{key})$. Otherwise, M-CODE chooses a random value $r$ and
outputs $E_z(r)$. A variant of this functionality for the two-party case was initially
introduced by [CEJ+07]. The intuitive idea is to generate a ciphertext that encrypts
$m_{key}$ multiplied by $(m_{out}/m_{in})^r$ for a random $r$. If $m_{in} = m_{out}$, then the output
would be $m_{key}$. We assume party $P_i$ has $x_i$, all the parties know $c_{out}$, $c_{in}$, $c_{key}$, $z$,
$(y, y_1 = g^{x_1}, \ldots, y_n = g^{x_n})$, and let
$c_{out} = E_y(m_{out}) = (\alpha, \beta)$, $c_{in} = E_y(m_{in}) = (\gamma, \delta)$, $c_{key} = E_y(m_{key}) = (\lambda, \mu)$.
The protocol for M-CODE proceeds as follows:

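1. Each party $P_i$ chooses $e_i \in_R [q]$, and computes $\varepsilon_i = (\alpha/\gamma)^{e_i}$,
   $\zeta_i = (\beta/\delta)^{e_i}$, $\pi_i = PK\{e_i : \varepsilon_i = (\alpha/\gamma)^{e_i} \text{ and } \zeta_i = (\beta/\delta)^{e_i}\}$,
   and broadcasts $(\varepsilon_i, \zeta_i, \pi_i)$.

2. Let $\varepsilon = \prod_{i \in S_1} \varepsilon_i$ and $\zeta = \prod_{i \in S_1} \zeta_i$, where $S_1$ is the set of parties
   which sent valid messages. Each party $P_i$ chooses $r_i$ randomly and computes
   $d_i = (d_{i1}, d_{i2}) = E_z((\varepsilon\lambda)^{x_i})$ and
   $\sigma_i = PK\{(r_i, x_i) : d_{i1} = g^{r_i},\ d_{i2} = z^{r_i} (\varepsilon\lambda)^{x_i},\ y_i = g^{x_i}\}$,
   and broadcasts $(d_i, \sigma_i)$.

3. Let $S_2$ be the set of parties that sent valid messages in steps 1 & 2. If $|S_2| < t$,
   then the protocol aborts. Each party $P_i$, using the homomorphic multiplication,
   computes $d = (d_1, d_2) = E_z((\varepsilon\lambda)^x) = \prod_{j \in S_2} d_j^{L_j(0)}$, where $L_j(\cdot)$ is a
   Lagrangian on the indices in $S_2$. $P_i$ uses homomorphic operations to compute
   $E_z(m_{key}) = (1/d_1, \zeta\mu/d_2)$, which is

   $$E_z\left(\frac{\zeta\mu}{(\varepsilon\lambda)^x}\right) = E_z\left(\left(\frac{\beta/\delta}{(\alpha/\gamma)^x}\right)^{e} \cdot \frac{\mu}{\lambda^x}\right) = E_z\left(\left(\frac{m_{out}}{m_{in}}\right)^{e} \cdot m_{key}\right),$$

   where $e = \sum_{i \in S_1} e_i$.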
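
The three steps of M-CODE can be traced in a small Python sketch. It is only an
illustration under stated assumptions: toy group parameters, all parties honest (so
$S_1 = S_2$ is the full set), and every proof of knowledge omitted. The function names
and the optional `es` argument (fixed exponents $e_i$, for reproducibility) are
hypothetical.

```python
import secrets

P, Q, G = 2039, 1019, 4  # toy safe-prime group (illustrative assumption)

def inv(a):
    return pow(a, -1, P)

def enc(pk, m):
    r = secrets.randbelow(Q)
    return (pow(G, r, P), m * pow(pk, r, P) % P)

def dec(sk, c):
    return c[1] * inv(pow(c[0], sk, P)) % P

def lagrange_at_zero(i, S):
    num = den = 1
    for j in S:
        if j != i:
            num = num * (-j) % Q
            den = den * (i - j) % Q
    return num * pow(den, -1, Q) % Q

def m_code(x_shares, c_out, c_in, c_key, z, es=None):
    """Return a ciphertext under z of (m_out / m_in)^e * m_key."""
    (alpha, beta), (gamma, delta), (lam, mu) = c_out, c_in, c_key
    parties = sorted(x_shares)
    if es is None:
        es = [secrets.randbelow(Q) for _ in parties]
    # Step 1: each party contributes (alpha/gamma)^{e_i} and (beta/delta)^{e_i}.
    eps = zeta = 1
    for e_i in es:
        eps = eps * pow(alpha * inv(gamma) % P, e_i, P) % P
        zeta = zeta * pow(beta * inv(delta) % P, e_i, P) % P
    # Step 2: each party encrypts (eps * lambda)^{x_i} under the new key z.
    base = eps * lam % P
    d = {i: enc(z, pow(base, x_shares[i], P)) for i in parties}
    # Step 3: combine with Lagrange coefficients, then divide out.
    d1 = d2 = 1
    for i in parties:
        L = lagrange_at_zero(i, parties)
        d1 = d1 * pow(d[i][0], L, P) % P
        d2 = d2 * pow(d[i][1], L, P) % P
    return (inv(d1), zeta * mu % P * inv(d2) % P)
```

When the two plaintexts match, the factor $(m_{out}/m_{in})^e$ is 1 and the holder of the
secret key for $z$ recovers $m_{key}$; otherwise the output is masked by a power of the ratio.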


3 Lagrangian $L_i$ on the points in $T$ is a degree-$t$ polynomial such that $L_i(x) = 1$ if $x = i$
and $L_i(x) = 0$ if $x \in T$ and $x \neq i$. The polynomial $Q'(x) = \sum_{i \in T} s_i L_i(x)$ is a degree-$t$
polynomial that goes through the points $(i, s_i)_{i \in T}$, and thus must be $Q(x)$.
3 Multi-party Computing with Encrypted Data

We assume the circuit $C$ of interest is normalized: all intermediate gates are NAND
gates, and output gates are IDENTITY gates (see footnote 4). We can easily attain this
circuit by adding another layer of IDENTITY gates on top of a circuit that consists of
NAND gates.

3.1 First Protocol ($\mathcal{P}_1$)

In the first protocol, called $\mathcal{P}_1$, each gate is garbled, and then the computing parties
'connect' gates by adding wiring information using M-CODE.

Preprocessing Stage. The first step is to establish a global public key $y$ for ElGamal
encryption. The computing parties have shares of the corresponding secret key $x$. Once
the public key is established, the next step is to generate truth tables for individual gates.
The columns of input, output, and intermediate gates differ slightly, as can be seen in
Figure 2, which shows the structure of truth tables.

1. Input and Output. These are encrypted with the global public key $y$.

2. Placeholders for the wiring information. This connects a row of the truth table to 
matching rows in successor gates. 

3. The columns PK and SK contain a random ElGamal key pair, where the private key 
is encrypted under the global public key y (and the wiring information is encrypted 
using the secret keys in SK). 

4. For output gates, ciphertexts in column Final encrypt the same plaintexts as cipher- 
texts in column In. 

During the preprocessing stage, the parties can generate a polynomial number of
garbled gates that can later be used for evaluating circuits. Therefore it suffices to know a
bound on the sizes of circuits to be evaluated later. Preprocessing can be done in a
constant number of rounds using general MPC protocols [KOS03, IPS08]. If the number
of computing parties is small, it can be done explicitly in $2n$ rounds, where $n$ is the
number of computing parties, using the protocol in Appendix A.

Input contribution is performed by publishing a ciphertext $c = (c_1, c_2) = E_y(g^b)$
for an input $b \in \{0, 1\}$. This can be done securely by adding
$PK\{r : (c_1 = g^r, c_2 = y^r) \text{ or } (c_1 = g^r, c_2 = g y^r)\}$.

Online Stage: Generation of Wires Between Garbled Gates. In Figure 2, $G_i$ is the
left predecessor of $G_k$. The connection between the two gates should be established
through some "wiring" such that during the computation the output of $G_i$ can be
propagated to the left input of $G_k$. So, rows of $T_i$ with output value $b \in \{0, 1\}$ should be
connected to rows of $T_k$ with left input value $b$.

Requirements for Wiring. In our protocol, the following conditions are considered in 
generating wires. 


4 An IDENTITY gate has a single input bit (wire) and a single output bit, and it copies the
input bit value to its output.
Fig. 2. Garbled Truth Tables for the Gates ($G_i$, $G_j$, $G_k$, $G_e$). The topology of the gates is given
on the right. $G_j$ is an input gate, $G_e$ is an output gate, and $G_i$, $G_k$ are intermediate gates. Table
$T_x$ is the truth table describing gate $G_x$; $y$ is the global public key. Each row of an intermediate
truth table has two sets of (secret, public) keys, and contains the wiring information, "connecting"
it to the next gate, encrypted using these two keys. $E[0]$ and $E[1]$ are $E_y(g^0)$ and $E_y(g^1)$
respectively. In table $T_i$, $pk_{i1} = pk_{i1L} \cdot pk_{i1R}$, and $pk_{i2}, \ldots, pk_{i4}$ are defined similarly.
In the Wires columns, $E(a, b, c, d)$ denotes the concatenation of $E(a), \ldots, E(d)$.


- (Encrypting the Wiring Information.) The wiring information, except wirings
  connecting an input gate to an intermediate gate, should be encrypted. Public wiring
  may help the (even semi-honest) adversary to learn more information than the
  output of $C$. Therefore, it is encrypted with the public keys stored in columns $PK_L$ and
  $PK_R$.

- (Conditional Exposure of Wiring Information.) For the computation to proceed, the
  protocol should reveal the wiring information for the rows along the computational
  path. In the beginning, wirings from input gates are public. Along the computational
  path, on each gate, exactly one row should allow decryption of the wiring information.

- (Oblivious Generation of Wiring Information.) The wiring information is added to
  garbled gates after they are built. It is essential that, even if the truth table is encrypted
  and shuffled, the parties are still able to add the wiring information.

Computation of a Circuit Using Wires. Let $T_i[a][b]$ denote the element located at
column $a$ and row $b$ in $T_i$. The column Wires contains wiring information, and we
denote the column Wires from $T_i$ to $T_k$ by $\mathrm{Wires}_{[i \to k]}$ (see footnote 5). Looking at the
column Wires alone, $\mathrm{Wires}(v)$ denotes the $v$-th row of this column in plaintext form. For
example, $\mathrm{Wires}_{[i \to k]}(2) = (*, *, sk_{k3L}, sk_{k4L})$ in Figure 2. We also use $\mathrm{Wire}(v, w)$ to
denote the $w$-th element of $\mathrm{Wires}(v)$. If $\mathrm{Wire}_{[i \to k]}(v, w) \neq *$, it means that
$T_i[\mathrm{Out}][v] = T_k[\mathrm{In}][w]$. In Figure 2, for example, we have $\mathrm{Wire}_{[i \to k]}(2, 3) \neq *$ because
$T_i[\mathrm{Out}][2] = T_k[\mathrm{In}_L][3] = E[0]$.

This wiring information helps the circuit computation to proceed correctly. The
computation proceeds in order from input gates to output gates. In Figure 2, for example, if
row 2 of $T_i$ and row 1 of $T_j$ are on the computation path, then row 3 of $T_k$ is also on
the computation path because $w = 3$ is the only row where $\mathrm{Wire}_{[i \to k]}(2, w) \neq *$ and
$\mathrm{Wire}_{[j \to k]}(1, w) \neq *$.

5 If $G_i$ has another outgoing wire, say to $G_m$, $T_i$ will have another column $\mathrm{Wires}_{[i \to m]}$.

Connecting the Gates: Fill in Wires Columns.

- For every $\mathrm{Wire}_{[i \to k]}(v, w)$ of an intermediate gate $T_i$, run M-CODE for
  $c_{out} = T_i[\mathrm{Out}][v]$, $c_{in} = T_k[\mathrm{In}][w]$ and $c_{key} = T_k[\mathrm{SK}][w]$, with the key
  $z = T_i[\mathrm{PK}_L][v] \cdot T_i[\mathrm{PK}_R][v]$.

- For every $\mathrm{Wire}_{[j \to k]}(v, w)$ of an input gate $T_j$, run M-CODE for
  $c_{out} = T_j[\mathrm{Out}][v]$, $c_{in} = T_k[\mathrm{In}][w]$ and $c_{key} = T_k[\mathrm{SK}][w]$, with the trivial key
  $z = g^0$.

Depending on the circuit topology, the subscript of a column may differ (e.g., $\mathrm{In}_L$, $\mathrm{In}_R$, or In).

Local Computation. Each party computes the output of $C$ using the M-CODE transcripts on
the input gates.

Fig. 3. Online Stage of $\mathcal{P}_1$

Constructing Wires. We implement each $\mathrm{Wire}_{[i \to k]}(v, w)$ using an M-CODE transcript
for $c_{out} = T_i[\mathrm{Out}][v]$, $c_{in} = T_k[\mathrm{In}][w]$, $c_{key} = T_k[\mathrm{SK}][w]$, and $z = T_i[\mathrm{PK}][v]$ (see
footnote 6). This directly satisfies the requirements of encrypted wiring and oblivious
wiring generation. Conditional exposure is achieved by executing M-CODE protocols in the
input layer with a trivial public key $z = 1$, so that the wiring information in the input layer
is known to every party.

The description of $\mathcal{P}_1$ can be found in Figure 3. Running the online stage takes
two rounds. The communication complexity of $\mathcal{P}_1$ is $O(nk|C|)$ (plus the NIZK, if we
assume the CRS case), where $|C|$ is the size of the circuit.

3.2 Second Protocol ($\mathcal{P}_2$)

The idea of $\mathcal{P}_2$ is that in a preprocessing stage, the parties generate a garbled circuit,
using Yao's technique, of a universal circuit. The garbled circuit has a restriction on
the keys of input wires that allows the online computation to take only one round in
our model, as opposed to the two-round OT-based approach of Yao. The preprocessing
stage can be done in a constant number of rounds, using general MPC protocols
[KOS03, IPS08].

Preprocessing Stage: Garbling Universal Circuit. The first step is to establish a
global public key $y$ for ElGamal encryption. The computing parties have shares of the
corresponding secret key $x$. In contrast to protocol $\mathcal{P}_1$, however, here ElGamal
encryption is used only for the input layer.

Next, a garbled circuit for the universal circuit is generated, using Yao's garbled circuit
technique [Y86]. In the generation procedure, for each wire $i$, two random keys, $w_0^i$
and $w_1^i$, are generated. The key $w_0^i$ (resp., $w_1^i$) represents 0 (resp., 1) for wire $i$. For each
gate $G_j$, a truth table $T_j$ is generated. In each table, a private key encryption (denoted
$(G, E, D)$) with efficiently verifiable range (based on a pseudorandom function) is used
[LP09] (see footnote 7). Figure 4 shows the structure of the garbled circuit.

6 Depending on the circuit topology, if this is a left input or right input to the gate, the pair
($c_{in}$, $c_{key}$) may also be ($T_k[\mathrm{In}_L][w]$, $T_k[\mathrm{SK}_L][w]$) or ($T_k[\mathrm{In}_R][w]$, $T_k[\mathrm{SK}_R][w]$).

Fig. 4. Garbled Truth Tables for the Gates ($G_i$, $G_j$, $G_k$, $G_e$). The topology of the gates is given
on the right. $G_j$ is an input gate, $G_e$ is an output gate, and $G_i$, $G_k$ are intermediate gates. Table
$T_x$ is the truth table describing gate $G_x$; $y$ is the global public key. Encryption $E$ is a private key
encryption based on a pseudorandom function with efficiently verifiable range [LP09].

- Recall that we assume all output gates are identity gates, with only one incoming
  wire and only two rows in the corresponding truth table. Each row encrypts the
  Boolean value represented by the corresponding wire, and the rows are randomly
  shuffled. An example is given in Figure 4: in the first row of table $G_e$, the input value
  is 1 (the key $w_1^k$ represents 1), and it encrypts 1, which is the output value of this
  row.

- For all other gates, each gate has two incoming wires and four rows. Each row
  encrypts a key for the outgoing wire, which represents the appropriate Boolean value
  of NAND of the incoming wires' values, and the rows are randomly shuffled. For
  example, in $G_k$ of Figure 4 the first row encrypts $w_0^k$, the representation of 0 for wire
  $k$, since NAND of the values that the keys of the incoming wires represent (i.e., the
  value 1 represented by $w_1^i$ in wire $i$, and the value 1 represented by $w_1^j$ in wire $j$)
  is 0.

To construct a secure protocol for MP-CED, we depart from the traditional Yao garbled
circuit technique by placing a restriction on input wires.

- A random element $h \in G_q$ is chosen, which no party knows, and $H = E_y(h)$ is
  published. We emphasize that $H$ is generated once and for all. In other words, every
  instance of a garbled universal circuit can use the same $H$.

- For input wire $j$, two keys $w_0^j, w_1^j \in G_q$ are randomly generated, conditioned on
  $w_1^j = h \cdot w_0^j$. Only the encryption of the first key, $W_0^j := E_y(w_0^j)$, is published.

Since we garble a universal circuit, it suffices to know a bound on the sizes of circuits
to be evaluated later. A universal circuit of size $O(k \log k)$ can accept circuits of size $k$
as inputs [KS08].

Input contribution is performed such that for input $b \in \{0, 1\}$, a ciphertext
$c = (c_1, c_2) = E_y(h^b)$ is published.

- When input is 0, publish $E_y(1)$.

- When input is 1, publish a re-encryption of $H$ (recall $H = (H_1, H_2) = E_y(h)$).

A proof of knowledge is added,
$PK\{r : (c_1 = g^r, c_2 = y^r) \text{ or } (c_1 = H_1 g^r, c_2 = H_2 y^r)\}$.

7 Roughly speaking, in such an encryption scheme, given a ciphertext and a key, it is efficiently
verifiable whether the given ciphertext was encrypted under the given key. This helps
computing parties to correctly compute the garbled circuit.

Online Stage: Obtaining keys for input-wires. Computing parties need to obtain a
key, for each wire $j$, that represents the Boolean value $b$ that the corresponding input
ciphertext encrypts, that is, $w_b^j$. But the key should not leak any information about
the input ciphertext. Our protocol meets this requirement by using the homomorphism of
ElGamal encryption (see footnote 8). Let $c_j$ be the ciphertext of contributed input
$b \in \{0, 1\}$ for input wire $j$. Computing parties work as follows:

- For every input wire $j$, compute $W^j = W_0^j \cdot c_j$ locally using the homomorphism of
  ElGamal encryption. Then, decrypt via threshold decryption by computing parties
  using their shares for $x$. This gives $w_b^j$, which matches the input $b$.

- Each party computes the output of $C$ using the keys $w_b^j$ locally.

Running the online stage in $\mathcal{P}_2$ takes one round. The communication complexity of
$\mathcal{P}_2$ is $O(nk|C| \log |C|)$ (plus the NIZK, if we assume the CRS case), where $|C|$ is the
size of the circuit.
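
The key-selection trick of the online stage can be checked with a short Python sketch.
It rests on assumptions made only for illustration: toy group parameters, a single
secret key `x` standing in for threshold decryption, no proofs of knowledge, and
hypothetical function names.

```python
import secrets

P, Q, G = 2039, 1019, 4  # toy safe-prime group (illustrative assumption)

def enc(y, m):
    r = secrets.randbelow(Q)
    return (pow(G, r, P), m * pow(y, r, P) % P)

def dec(x, c):
    return c[1] * pow(c[0], -x, P) % P

def contribute_input(y, H, b):
    """Publish E_y(h^b): E_y(1) if b = 0, a re-encryption of H if b = 1."""
    r = secrets.randbelow(Q)
    if b == 0:
        return (pow(G, r, P), pow(y, r, P))
    return (H[0] * pow(G, r, P) % P, H[1] * pow(y, r, P) % P)

def input_wire_key(x, W0, c):
    """Decrypt W_0 * c, which yields w_0 * h^b = w_b."""
    W = (W0[0] * c[0] % P, W0[1] * c[1] % P)
    return dec(x, W)
```

Because $w_1^j = h \cdot w_0^j$, multiplying $W_0^j$ by an encryption of $h^b$ selects the key for bit
$b$ without anyone decrypting the contributed input itself.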


MP-CED vs. MPC with Preprocessing. General MPC and MP-CED can be reduced to
each other.

- Given a protocol $\pi$ for MP-CED, we can construct a protocol $\pi'$ for MPC with
  preprocessing, as follows. In the preprocessing stage of $\pi'$, the parties share an encryption
  key. In the online stage of $\pi'$, each party publishes an encryption of its input under the
  shared key, and the parties follow protocol $\pi$. The resulting MPC protocol $\pi'$ requires
  one more online round than the underlying protocol $\pi$. This approach is implicitly
  used in [FH96, JJ00, CDN01, DN03].

- Given a protocol $\pi'$ for MPC, we can construct a protocol $\pi$ for MP-CED, as follows.
  In MP-CED, the parties share a secret key, and the inputs are encrypted. Protocol $\pi$
  should compute $C$ on these given input ciphertexts. This can be done by the
  parties running protocol $\pi'$ using a circuit $C'$ derived from $C$. Circuit $C'$ consists of
  two stages: the first stage of $C'$ gets shares of the secret key and the ciphertexts,
  and decrypts the ciphertexts to give plaintexts. The second stage of $C'$ essentially
  evaluates $C$ on these plaintext inputs from the first stage. In running the protocol $\pi$,
  each party's input is its share of the secret key. Circuit $C'$ has more gates than $C$.
  However, if the round complexity of $\pi'$ does not depend on the depth of the circuit,
  then the round complexity of $\pi$ is the same as the round complexity of $\pi'$.

On Basing MP-CED on Doubly-Homomorphic Encryption. Recently, Gentry
constructed a doubly homomorphic encryption scheme using ideal lattices [G09], which
solves the CED problem. Since our goal is to give a round-efficient protocol, it is an
interesting question whether doubly-homomorphic encryption allows non-interactive
secure computation. However, this seems unlikely.

8 In fact, any homomorphic encryption can be used. We chose to use ElGamal encryption since
it is already used in $\mathcal{P}_1$.

- (Threshold Decryption.) It is not known whether Gentry's scheme supports threshold
  decryption. Thus, there has to be at least one party which can decrypt ciphertexts by
  itself. If this party sees the inputs (which are encrypted and published in the MP-CED
  model), it can decrypt private inputs of other parties and break the security. Thus,
  there must be a separation between parties who can decrypt and parties who get
  access to the input and intermediate ciphertexts.

- (Malicious Parties.) Parties without decryption capability would compute a circuit on
  encrypted inputs using the double homomorphism. In order for the protocol to compute
  output in plaintext form, they have to submit some ciphertexts to a party with
  decryption capability. In the malicious setting, to make sure that they applied the double
  homomorphism correctly, some kind of zero-knowledge proof should be added to the
  ciphertexts they submit. However, it is not clear how such a proof can be constructed
  when the verifier has the decryption capability: as mentioned above, it must not see
  the input ciphertexts.

The above issues also stand in the way of achieving MPC protocols against an active
adversary with doubly homomorphic encryption.
A Explicit Preprocessing of $\mathcal{P}_1$

During the preprocessing stage, the parties can generate a polynomial number of garbled
gates, which can later be used for evaluating circuits. Therefore it suffices to know a
bound on the sizes of circuits to be evaluated later. We show how to generate such truth
tables explicitly given $y$ as a global public key.

Throughout, each bit $b$ will be encrypted with plaintext $g^b$. Denote by $\langle m \rangle$ a simple
ElGamal ciphertext (with randomness $r = 0$): $(1, m)$. For an ElGamal ciphertext $c$ for
a bit, its negation $\neg c$ is defined as $\langle g^1 \rangle / c$. For two ElGamal ciphertexts $a = (a_1, a_2)$ and
$b = (b_1, b_2)$, define $\mathrm{ZKe}_u(a, b)$, the proof that $b$ is a re-encryption of $a$ with public key
$u$, as $PK\{r : b_1 = g^r a_1,\ b_2 = u^r a_2\}$. When the public key is not specified, ZKe means
$\mathrm{ZKe}_y$. The construction details can be found in Appendix A.3.


A.l Preliminaries: Joint Generation of Garbled Gates 

We associate a gate with the truth table for it. The entries of the truth tables are encrypted 
Boolean values, and the rows of each truth table are permuted, such that only a threshold 
of the parties can (1) recover any plaintext and (2) learn the permutation of the rows. 
Sampling a Random Encrypted Boolean Value. In this protocol, $n$ parties perform
an oblivious analogue of XORing their respective random bits in $n$ rounds.

1. Each party $P_i$ selects $a_i \in_R \{0, 1\}$ and computes $\tilde{a}_i = E_y(g^{a_i})$,
   $\pi_i = \mathrm{ZKe}(\langle g^0 \rangle, \tilde{a}_i) \lor \mathrm{ZKe}(\langle g^1 \rangle, \tilde{a}_i)$ and broadcasts $(\tilde{a}_i, \pi_i)$.
   Let $S = \{j : \pi_j \text{ is valid}\}$. Set $a \leftarrow \tilde{a}_\ell$, where $\ell$ is the smallest element of $S$.

2. For $j = 2, \ldots, |S|$:
   Let $i$ be the $j$-th smallest element in $S$. $P_i$ computes an encryption $d_i$ such that
   $d_i = (d_{i1}, d_{i2})$ is a re-encryption of $a$ if $a_i = 0$ or a re-encryption of $\neg a$ otherwise.
   Then $P_i$ broadcasts $(d_i, \psi_i)$ where

   $\psi_i = (\mathrm{ZKe}(\langle g^0 \rangle, \tilde{a}_i) \land \mathrm{ZKe}(a, d_i)) \lor (\mathrm{ZKe}(\langle g^1 \rangle, \tilde{a}_i) \land \mathrm{ZKe}(\neg a, d_i))$.

   If $\psi_i$ is valid, then each party sets $a \leftarrow d_i$.

As in computing XOR, it is enough that one of the bits is random (or, in our case, that one
party is honest) to guarantee a random output, as long as corrupt parties cannot have
their bit choices depend on the bits of other parties. In our case, semantic security of
ElGamal and the soundness of the attached proof guarantee they cannot. The invariant of
the protocol is that at the end of each round the ciphertext $a$ encrypts the exclusive-or of
the $a_i$'s so far.
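
The XOR invariant can be illustrated with the following Python sketch. It is a
simplification under stated assumptions: toy group parameters, all parties honest, the
ZKe proofs omitted, and hypothetical function names. Negation is implemented as the
componentwise division $\langle g^1 \rangle / c$.

```python
import secrets

P, Q, G = 2039, 1019, 4  # toy safe-prime group (illustrative assumption)

def inv(a):
    return pow(a, -1, P)

def enc(y, m):
    r = secrets.randbelow(Q)
    return (pow(G, r, P), m * pow(y, r, P) % P)

def dec(x, c):
    return c[1] * inv(pow(c[0], x, P)) % P

def negate(c):
    """<g^1> / c: turns an encryption of g^b into an encryption of g^(1-b)."""
    return (inv(c[0]), G * inv(c[1]) % P)

def reencrypt(y, c):
    """Multiply by a fresh encryption of 1 to re-randomize c."""
    r = secrets.randbelow(Q)
    return (c[0] * pow(G, r, P) % P, c[1] * pow(y, r, P) % P)

def sample_encrypted_bit(y, bits):
    """Each party in turn keeps or flips the running ciphertext a."""
    a = enc(y, pow(G, bits[0], P))
    for a_i in bits[1:]:
        a = reencrypt(y, a if a_i == 0 else negate(a))
    return a
```

Decrypting the result gives $g^{a_1 \oplus \cdots \oplus a_n}$, so a single uniformly random bit among
the inputs makes the encrypted bit uniform.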


Generating a Garbled IDENTITY Gate. First, run the procedure of sampling a random
encrypted Boolean value. Let the output of the procedure be $a$. The first row of an
IDENTITY gate is $a$, and the second row is computed by negating the value of $a$.


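Generating a Garbled NAND Gate.

1. Each party $P_i$ selects $a_i, b_i \in_R \{0, 1\}$ and computes $\tilde{a}_i = E_y(g^{a_i})$, $\tilde{b}_i = E_y(g^{b_i})$,
   $\pi_i = \mathrm{ZKe}(\langle g^0 \rangle, \tilde{a}_i) \lor \mathrm{ZKe}(\langle g^1 \rangle, \tilde{a}_i)$, and
   $\phi_i = \mathrm{ZKe}(\langle g^0 \rangle, \tilde{b}_i) \lor \mathrm{ZKe}(\langle g^1 \rangle, \tilde{b}_i)$, and broadcasts
   $(\tilde{a}_i, \tilde{b}_i, \pi_i, \phi_i)$.

2. Run the procedure of sampling random encrypted Boolean values with the $\tilde{a}_i$'s. Let $a$ be
   the output of the procedure. Let $S = \{j : \pi_j \text{ and } \phi_j \text{ are valid}\}$. Set $b \leftarrow \langle g^0 \rangle$ and
   $ab \leftarrow \langle g^0 \rangle$.

3. For $j = 1, \ldots, |S|$: Let $i$ be the $j$-th smallest element in $S$. $P_i$ computes encryptions $d_i$
   and $\delta_i$ such that

   - If $b_i = 0$, then $d_i$ is a re-encryption of $b$ and $\delta_i$ is a re-encryption of $ab$.

   - If $b_i = 1$, then $d_i$ is a re-encryption of $\neg b$ and $\delta_i$ is a re-encryption of $a/ab$.

   Then $P_i$ broadcasts $(d_i, \delta_i, \psi_i)$ where $\psi_i = \psi_i^0 \lor \psi_i^1$ for
   $\psi_i^0 = \mathrm{ZKe}(\langle g^0 \rangle, \tilde{b}_i) \land \mathrm{ZKe}(b, d_i) \land \mathrm{ZKe}(ab, \delta_i)$ and
   $\psi_i^1 = \mathrm{ZKe}(\langle g^1 \rangle, \tilde{b}_i) \land \mathrm{ZKe}(\neg b, d_i) \land \mathrm{ZKe}(a/ab, \delta_i)$.
   If $\psi_i$ is valid, then each party sets $b \leftarrow d_i$ and $ab \leftarrow \delta_i$.

The invariant of the loop is that at the end of each round the ciphertext $ab$ encrypts the
exclusive-or of the $a \cdot b_i$'s so far. After $a$, $b$, $ab$ are generated, each party $P_i$ can complete
the truth table by locally negating the ciphertexts as described in the table.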


    In_L        In_R        Out
    $a$         $b$         $\neg ab$
    $a$         $\neg b$    $ab \cdot (\neg a)$
    $\neg a$    $b$         $ab \cdot (\neg b)$
    $\neg a$    $\neg b$    $a \cdot b / ab$
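
A quick arithmetic check of the Out column, working directly with the exponents (the
encrypted bits) rather than with ciphertexts: a ciphertext product adds exponents, a
quotient subtracts them, and negation maps $v$ to $1 - v$. This is a sketch with
hypothetical helper names; the third row's Out entry is taken by symmetry with the
second row, which is an assumption since that cell is not legible in the source.

```python
def neg(v):
    """Exponent of the negated ciphertext <g^1>/c."""
    return 1 - v

def nand(u, v):
    return 1 - u * v

def garbled_nand_rows(a, b):
    """Rows (In_L, In_R, Out) as plaintext exponents, before shuffling."""
    ab = a * b
    return [
        (a,      b,      neg(ab)),        # not(ab)
        (a,      neg(b), ab + neg(a)),    # ab * (not a)
        (neg(a), b,      ab + neg(b)),    # ab * (not b), by symmetry
        (neg(a), neg(b), a + b - ab),     # a * b / ab
    ]

def table_is_consistent(a, b):
    return all(out == nand(l, r) for l, r, out in garbled_nand_rows(a, b))
```

For every choice of the hidden bits $a$ and $b$ the four rows cover all input pairs, and
each Out exponent equals the NAND of the row's two inputs.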
A.2 Preliminaries: Jointly Recoverable Encrypted ElGamal Key Pairs

Verifiable ElGamal Encryption of Discrete Logarithm. To generate a jointly
recoverable encrypted ElGamal key pair, we first introduce the following verifiable encryption
of discrete logarithm.

Let $\gamma := g^z$. We want to encrypt $z$ in a verifiable manner. Let $z_i$ be the $i$-th rightmost
bit of $z$ for $i \in [k]$. The verifiable encryption is $E_y(z) = (Z_0, \ldots, Z_{k-1}, \pi)$, where
$Z_i = E_y(g^{z_i 2^i})$ for $i \in [k]$. The proof $\pi$ is

$$\Big(\bigwedge_{i \in [k]} \big(\mathrm{ZKe}(\langle g^0 \rangle, Z_i) \lor \mathrm{ZKe}(\langle g^{2^i} \rangle, Z_i)\big)\Big) \land \mathrm{ZKe}\Big(\langle \gamma \rangle, \prod_{i} Z_i\Big).$$

When we get $(g^{z_0 2^0}, \ldots, g^{z_{k-1} 2^{k-1}})$ by decrypting $E_y(z)$, $z$ can be extracted via
exhaustive search in polynomial time in $k$ because each $z_i$ is a bit.

Note that the encryption scheme is homomorphic if we ignore the proof part.
Multiplication of two verifiable encryptions $E_y(z) = (Z_0, \ldots, Z_{k-1})$ and
$E_y(w) = (W_0, \ldots, W_{k-1})$ is defined as $E_y(z) \cdot E_y(w) = (Z_0 \cdot W_0, \ldots, Z_{k-1} \cdot W_{k-1})$.

Generation of Jointly Recoverable Encrypted ElGamal Key Pairs. For simplicity,
we omit the proof part of the verifiable encryption from the presentation below.
Generation of a key pair can be done as follows:

1. Each party $P_j$ runs ElGamal key generation and obtains $(k_j, g^{k_j})$. It broadcasts
   $(g^{k_j}, E_y(k_j))$.

2. Let $S$ be the set of parties whose encryptions are verified. In the PK column,
   $\prod_{j \in S} g^{k_j}$ is set; in the SK column, $\prod_{j \in S} E_y(k_j)$ is set.

Extraction of the secret key. Let $(Y_0, \ldots, Y_{k-1}) := \prod_{j \in S} E_y(k_j)$. Let $g^{z_i}$ be the
decryption of $Y_i$. Then given $(g^{z_0}, \ldots, g^{z_{k-1}})$, we can extract the secret key
$\sum_{j \in S} k_j = \sum_i z_i$ by finding each $z_i$ via exhaustive search, which can be done efficiently
since $z_i = c_i \cdot 2^i$ for some $c_i \in \{0, \ldots, |S|\}$.
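
The bitwise encryption and the exhaustive-search extraction can be sketched in Python
as below. The sketch assumes toy group parameters, a bit length of 10 (enough for
q = 1019), a single secret key `x` standing in for threshold decryption, and omits the
proof part; all names are hypothetical.

```python
import secrets

P, Q, G = 2039, 1019, 4  # toy safe-prime group (illustrative assumption)
K = 10                   # bit length; 2^10 > Q

def enc(y, m):
    r = secrets.randbelow(Q)
    return (pow(G, r, P), m * pow(y, r, P) % P)

def dec(x, c):
    return c[1] * pow(c[0], -x, P) % P

def encrypt_key_bitwise(y, k):
    """(Z_0, ..., Z_{K-1}) with Z_i = E_y(g^{k_i * 2^i}), k_i the i-th bit of k."""
    return [enc(y, pow(G, ((k >> i) & 1) << i, P)) for i in range(K)]

def multiply_vectors(u, v):
    """Componentwise product of two bitwise encryptions."""
    return [(a[0] * b[0] % P, a[1] * b[1] % P) for a, b in zip(u, v)]

def extract_key(x, Y, n):
    """Recover sum_j k_j (mod q) from the product of n bitwise encryptions."""
    total = 0
    for i, c in enumerate(Y):
        target = dec(x, c)
        # Position i holds g^{c_i * 2^i} with c_i in {0, ..., n}.
        count = next(c_i for c_i in range(n + 1)
                     if pow(G, c_i << i, P) == target)
        total += count << i
    return total % Q
```

The search at each position runs over at most $n + 1$ candidates, so extraction costs
$O(nk)$ exponentiations in this sketch.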

A.3 Preprocessing of $\mathcal{P}_1$

The preprocessing takes $2n$ rounds, since step 1.1 and step 1.2 can be executed
concurrently. This protocol is UC-secure, but for lack of space, we defer the proof of security
to the full version.

Step 1.1: Garbled Circuit Generation - Intermediate Gates. For each NAND gate,
run the procedure of joint generation of a garbled NAND gate in Appendix A.1 to fill
in the In and Out columns. For each pair of columns PK and SK, run the procedure of
jointly recoverable encrypted ElGamal key pairs in Appendix A.2 (see footnote 9). The
above tasks are executed in parallel.

Step 1.2: Garbled Circuit Generation - Output Gates 

1. Run the procedure of sampling random encrypted Boolean values in Appendix A.1, where
   each party $P_i$ selects $a_i \in_R \{0, 1\}$. Let $a$ be the output of the procedure and let
   $S = \{j : P_j \text{ behaved honestly during the procedure}\}$. Fill in the In and Out columns as an
   IDENTITY gate. In addition, run the procedures of jointly recoverable encrypted ElGamal
   key pairs in Appendix A.2 to fill the columns PK and SK. Let $z_1$ and $z_2$ be the two keys in
   the column PK.

9 Now in the online stage, $k$ instances of M-CODE are executed since $T_k[\mathrm{SK}][w]$ contains $k$
ElGamal ciphertexts. The communication complexity blows up by a multiplicative factor of $k$.

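2. In order to fill the Final column, each party $P_i$ such that $i \in S$ broadcasts
   $(\tilde{a}_i^{z_1} = E_{z_1}(g^{a_i}),\ \tilde{a}_i^{z_2} = E_{z_2}(g^{a_i}))$. Set $a^{z_1} \leftarrow \langle g^0 \rangle$, $a^{z_2} \leftarrow \langle g^1 \rangle$.
   Parties jointly compute $E_{z_1}(g^{\oplus_i a_i})$ and $E_{z_2}(g^{1 - \oplus_i a_i})$. In particular, for
   $j = 1, \ldots, |S|$:

   (a) Let $i$ be the $j$-th smallest element in $S$. $P_i$ computes encryptions $d_i$, $e_i$ such that $d_i$
       (resp. $e_i$) is a re-encryption of $a^{z_1}$ (resp. $a^{z_2}$) if $a_i = 0$ or a re-encryption of
       $\neg a^{z_1}$ (resp. $\neg a^{z_2}$) otherwise. Then $P_i$ broadcasts $(d_i, e_i, \psi_i)$ where
       $\psi_i = \psi_i^0 \lor \psi_i^1$ for

       $\psi_i^0 = \mathrm{ZKe}(\langle g^0 \rangle, \tilde{a}_i) \land \mathrm{ZKe}_{z_1}(\langle g^0 \rangle, \tilde{a}_i^{z_1}) \land \mathrm{ZKe}_{z_2}(\langle g^0 \rangle, \tilde{a}_i^{z_2}) \land \mathrm{ZKe}_{z_1}(a^{z_1}, d_i) \land \mathrm{ZKe}_{z_2}(a^{z_2}, e_i)$ and
       $\psi_i^1 = \mathrm{ZKe}(\langle g^1 \rangle, \tilde{a}_i) \land \mathrm{ZKe}_{z_1}(\langle g^1 \rangle, \tilde{a}_i^{z_1}) \land \mathrm{ZKe}_{z_2}(\langle g^1 \rangle, \tilde{a}_i^{z_2}) \land \mathrm{ZKe}_{z_1}(\neg a^{z_1}, d_i) \land \mathrm{ZKe}_{z_2}(\neg a^{z_2}, e_i)$.

   (b) If $\psi_i$ is valid, then each party sets $a^{z_1} \leftarrow d_i$ and $a^{z_2} \leftarrow e_i$. Otherwise, in the case of
       honest majority, parties collectively compute $a_i$ from threshold decryption using
       $(y_1, \ldots, y_n)$ and compute $a^{z_1}$, $a^{z_2}$ accordingly. In the case of honest minority, the
       protocol aborts. Fi-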
Abstract. We present a new construction of non-committing encryption
schemes. Unlike the previous constructions of Canetti et al. (STOC '96) and of
Damgård and Nielsen (Crypto '00), our construction achieves all of the following
properties:

- Optimal round complexity. Our encryption scheme is a 2-round protocol,
matching the round complexity of Canetti et al. and improving upon that in
Damgård and Nielsen.

- Weaker assumptions. Our construction is based on trapdoor simulatable 
cryptosystems, a new primitive that we introduce as a relaxation of those 
used in previous works. We also show how to realize this primitive based on 
hardness of factoring. 

- Improved efficiency. The amortized complexity of encrypting a single bit is
$O(1)$ public key operations on a constant-sized plaintext in the underlying
cryptosystem.

As a result, we obtain the first non-committing public-key encryption schemes 
under hardness of factoring and worst-case lattice assumptions; previously, such 
schemes were only known under the CDH and RSA assumptions. Combined 
with existing work on secure multi-party computation, we obtain protocols for 
multi-party computation secure against a malicious adversary that may adaptively 
corrupt an arbitrary number of parties under weaker assumptions than were 
previously known. Specifically, we obtain the first adaptively secure multi-party 
protocols based on hardness of factoring in both the stand-alone setting and the 
UC setting with a common reference string. 

Keywords: public-key encryption, adaptive corruption, non-committing encryp- 
tion, secure multi-party computation. 

1 Introduction 

Secure multi-party computation (MPC) allows several mutually distrustful parties to
perform a joint computation without compromising, to the greatest extent possible,
the privacy of their inputs or the correctness of the outputs. An important criterion
in evaluating the security guarantee is how many parties an adversary is allowed to
corrupt and when the adversary determines which parties to corrupt. Ideally, we want
to achieve the strongest notion of security, namely, against an adversary that corrupts
an arbitrary number of parties, and adaptively determines who and when to corrupt
during the course of the computation (and without assuming erasures). Even though
the latter is a very natural and realistic assumption about the adversary, most of the MPC
literature only addresses security against a static adversary, namely one that chooses
(and fixes) which parties to corrupt before the protocol starts executing. And if indeed
such protocols do exist, it is important to answer the following question:

What are the cryptographic assumptions under which we can realize 
MPC protocols secure against a malicious, adaptive adversary that 
may corrupt a majority of the parties? 

Towards answering this question, we revisit the problem of constructing non- 
committing encryption schemes, a cryptographic primitive first introduced by Canetti 
et al. [CFGN96] as a tool for building adaptively secure MPC protocols in the presence
of an honest majority. Informally, non-committing encryption schemes are semantically 
secure, possibly interactive encryption schemes, with the additional property that a 
simulator can generate special ciphertexts that can be opened to both a 0 and a 1. In a 
more recent work, Canetti et al. [CLOS02] (extending [B98]) showed how to construct
adaptively secure oblivious transfer protocols starting from non-committing public-key 
encryption schemes (i.e. the key generation algorithm must be non-interactive), which 
may in turn be used to construct MPC protocols secure against a malicious, adaptive 
adversary that may corrupt an arbitrary number of parties. 

Unfortunately, the only known constructions of non-committing public-key encryption
schemes (PKEs) are based on the CDH and RSA assumptions [CFGN96], and
the construction exploits in a very essential way that these assumptions give rise to
families of trapdoor permutations with a common domain. If we allow for an interactive
key generation phase, Damgard and Nielsen [DN00], building on [B97, CFGN96],
constructed 3-round non-committing encryption schemes based on a more general
assumption, that of simulatable PKEs, which may in turn be realized from DDH, CDH,
RSA and more recently, worst-case lattice assumptions [GPV08] (see Figure 1).

1.1 Our Results 

First, we present a new construction of non-committing encryption schemes, which 
simultaneously improves upon all of the previous constructions in [CFGN96, DN00]:

Optimal Round Complexity. We provide a construction of non-committing PKEs from 
simulatable cryptosystems. Our construction is surprisingly simple - a twist to the 
standard cut-and-choose techniques used in [DN00, KO04] - and also admits a fairly
straightforward simulation and analysis. In particular, our construction and the
analysis are conceptually and technically simpler than those in [CFGN96, DN00]:
we avoid having to analyze the number of ones in certain binomial distributions
as in [CFGN96] and to consider a subtle failure mode as in [DN00].

1 Refer to [C00, Section 5.2] for a discussion on how trusted erasures may be a problematic
assumption.

Reducing the assumptions. Informally, a simulatable PKE is an encryption scheme 
with special algorithms for obliviously sampling public keys and random cipher- 
texts without learning the corresponding secret keys and plaintexts; in addition, 
both of these oblivious sampling algorithms should be efficiently invertible. 

We define a weaker assumption, which we refer to as trapdoor simulatable 
cryptosystems, and prove that it is sufficient for our construction and analysis to 
go through. Roughly speaking, we provide the inverting algorithms in a simulatable 
cryptosystem with additional trapdoor information (hence the modifier “trapdoor”), 
which makes it easier to design a simulatable cryptosystem. 

Improved efficiency. While the main focus of this work is feasibility results (notably, 
reducing the computational assumptions for both non-committing encryption 
schemes and adaptively secure MPC), we show how to combine a variant of our 
basic construction with the use of error-correcting codes to achieve better efficiency. 
That is, the amortized complexity of encrypting a single bit is O(1) public-key
operations on a constant-sized plaintext in the underlying cryptosystem.

Thus, we obtain the following. 

Theorem 1 (informal). There exists a black-box construction of a non-committing 
public-key encryption scheme, starting from any trapdoor simulatable cryptosystem.

Factoring-Based constructions. Next, we derive trapdoor simulatable cryptosystems 
from a variant of Rabin’s trapdoor permutations (c.f. [H99, S96, FF02]) based on the
hardness of factoring Blum integers. 

Theorem 2 (informal). Suppose factoring Blum integers is hard on average. Then, 
there exists a trapdoor simulatable cryptosystem. 

We stress that we do not know how to construct a simulatable cryptosystem under 
the same assumptions; specifically, inverting the sampling algorithm for ciphertexts in 
our construction without the trapdoor (the factorization of the Blum integer modulus) 
appears to be as hard as factoring Blum integers. This shows that trapdoor simulatable
cryptosystems are indeed a meaningful and useful relaxation. In the process, we also
obtain the first factoring-based dense cryptosystems^2. When combined with enhanced
trapdoor permutations, this yields the first factoring-based non-interactive proofs of
knowledge [DP92].

Oblivious Transfer and MPC. We consider the applications of our main result to the 
constructions of adaptively secure oblivious transfer and general MPC protocols in both 
the stand-alone setting and the UC setting (c.f. [CLOS02, IPS08, CDSMW09]).


2 These are PKE schemes where a random string has an inverse polynomial probability of being
a valid public key.

[Figure 1: implication diagram with the following chains]
CDH, RSA -> simulatable common-domain TDP -> 2-round NCE
DDH, LWE -> simulatable PKE -> 3-round NCE
factoring BI -> trapdoor simulatable PKE

Fig. 1. Summary of previous results (solid lines) along with our contributions (dashed lines)

Theorem 3 (informal). There exists a black-box construction of a 6-round 1-out-of-$\ell$
oblivious transfer protocol for strings in the $\mathcal{F}_{\mathrm{COM}}$-hybrid model^3 in the UC setting
that is secure against a malicious, adaptive adversary, starting from any trapdoor
simulatable cryptosystem.

We add that if the oblivious key generation algorithm in the trapdoor simulatable 
cryptosystem achieves statistical indistinguishability (which is the case for all of the 
afore-mentioned constructions), then we obtain an OT protocol that is secure against a 
computationally unbounded malicious sender. While our OT protocol is not as efficient 
as that in the recent work of Garay, Wichs and Zhou [GWZ09] (we incur an additional
multiplicative overhead that is linear in the security parameter), our protocol along with 
our general framework offers several advantages: 

- In addition to relying on the $\mathcal{F}_{\mathrm{COM}}$ functionality and a simulatable PKE (to
implement non-committing encryption) as in our work, the [GWZ09] framework
requires a so-called enhanced dual-mode cryptosystem. This is a relatively high-level
CRS-based primitive from [PVW08] augmented with two main additional
properties: the first has a flavor of oblivious sampling; the second requires that the 
underlying CRS be a common random string (modulo some system parameters) 
and not just a common reference string. This requirement is inherent to their 
framework, since this CRS is generated using a coin-tossing protocol. This latter 
requirement is very restrictive, and the only known construction of an enhanced 
dual-mode cryptosystem is based on the quadratic residuosity assumption.

- Our protocol immediately handles 1-out-of-$\ell$ OT, whereas [GWZ09] only addresses
1-out-of-2 OT, a limitation inherited from [PVW08].

Combined with [CLOS02, IPS08, CDSMW09], we obtain the following corollaries:

Corollary 1 (informal). Assuming the existence of trapdoor simulatable cryptosystems,
there exist adaptively secure multi-party protocols in the stand-alone setting and
in the $\mathcal{F}_{\mathrm{COM}}$-hybrid model in the UC setting against a malicious adversary that may
adaptively corrupt any number of parties. 

Specifically, we obtain the first adaptively secure multi-party protocols based on 
hardness of factoring in both the stand-alone setting and the UC setting with a common 
reference string. 


3 $\mathcal{F}_{\mathrm{COM}}$ is an ideal functionality for commitment.

1.2 Additional Related Work

The problem of constructing encryption schemes that are secure against adaptive 
corruptions was first addressed in the work of Beaver and Haber [BH92]. They
considered a simpler scenario where the honest parties have the ability to securely 
and completely erase previous states. For instance, an honest sender could erase 
the randomness used for encryption after sending the ciphertext, so that upon being 
corrupted, the adversary only gets to see the corresponding plaintext. An intermediate 
model, wherein we assume secure erasures for either the sender or receiver but not both 
(or, by limiting the adversary to corrupting at most one of the two parties), has been 
considered in several other works [JL00, CHK05, KO04].

Organization. We present an overview of our constructions in Section 2, preliminaries
in Section 3, the formulation of a trapdoor simulatable PKE in Section 4, our factoring-
based trapdoor simulatable PKE in Section 6 and our non-committing encryption
scheme in Section 5. We also show the construction of a 6-round oblivious
transfer protocol.

2 Overview of Our Constructions 

At a high level, our non-committing PKE is similar to that from previous works
[CFGN96, DN00, KO04]. The receiver generates a collection of public keys in such
a way that it only knows an $\alpha$ fraction of the corresponding secret keys; this can
be achieved by generating an $\alpha$ fraction of the public keys using the key generation
algorithm and the remaining $1-\alpha$ fraction obliviously. Similarly, the sender generates
a collection of ciphertexts in such a way that it only knows an $\alpha$ fraction of the
corresponding plaintexts. Previous constructions all work with the natural choice of
$\alpha = 1/2$ so that the simulator generates a collection of ciphertexts half of which
are encryptions of 0 and the other half are encryptions of 1. As noted in [KO04],
this is sufficient for obtaining non-committing PKEs wherein at most one party is
corrupted. Roughly speaking, the difficulty in handling simultaneous corruptions of
both the sender and the receiver with $\alpha = 1/2$ is that in the simulation, the receiver’s
choice of the $\alpha$ fraction of keys completely determines the sender’s choice of the
$\alpha$ fraction of ciphertexts, whereas in an actual honest encryption, these choices are
completely independent (we elaborate on this later in this section). The key insight
in our construction is to work with a smaller value of $\alpha$ (it turns out that 1/4 is good enough).

A Toy Construction. Consider the following encryption scheme, which is a simplification
of that in [KO04, DN00]. The receiver generates a pair of public keys $(PK_0, PK_1)$
by generating one key (selected at random) using the key-generation algorithm, and the
other using the oblivious sampling algorithm. To encrypt a bit $b$, the sender generates a
pair of ciphertexts $(C_0, C_1)$ as follows: pick a random bit $r$, set $C_r$ to be $\mathsf{Enc}_{PK_r}(b)$ and
choose $C_{1-r}$ using the oblivious sampling algorithm. To decrypt, the receiver decrypts
exactly one of $C_0, C_1$ using the secret key that it knows. This construction corresponds
to $\alpha = 1/2$ where $\alpha$ is the fraction of public keys for which the receiver knows the
secret key, and also the fraction of ciphertexts for which the sender knows the plaintext.
Observe that this encryption scheme has the following properties:

- It has a constant decryption error of 1/4 if an obliviously sampled ciphertext is
equally likely to decrypt to 0 or 1. As shown in [KO04], this error can be reduced
by standard repetition techniques.

- It tolerates corruption of either the sender or the receiver, but not both. Consider a
simulator that generates both of $(PK_0, PK_1)$ (along with $SK_0, SK_1$) using the key-
generation algorithm, and a ciphertext $(C_0, C_1)$ as follows: pick a random bit $\beta$,
and set $C_0$ to be $\mathsf{Enc}_{PK_0}(\beta)$ and $C_1$ to be $\mathsf{Enc}_{PK_1}(1-\beta)$. Suppose the simulator
later learns that this is an encryption of 0. If only the sender is corrupted, the
simulator claims $r = \beta$ and that $C_{1-\beta}$ is obliviously sampled. If only the receiver
is corrupted, it claims that it knows $SK_\beta$ and that $PK_{1-\beta}$ is obliviously sampled.

We highlight two subtleties in the above simulation strategy. First, it achieves 0
decryption error (as opposed to 1/4 in an honest encryption); this can be fixed with a
somewhat more involved simulation strategy. This in turn becomes pretty complicated
once we use standard repetition techniques to reduce the decryption error. Next, it is
always the case in the simulation that either both $PK_0$ and $C_0$ are obliviously sampled,
or both $PK_1$ and $C_1$ are obliviously sampled. As such, this simulation strategy fails if
both the sender and the receiver are corrupted, because in an actual encryption, which of
$PK_0, PK_1$ and which of $C_0, C_1$ are obliviously sampled are determined independently.

Our Encryption Scheme. As noted in the introduction, the key insight in our
construction is to work with a small value of $\alpha$. In addition, following [DN00], we
use a random $k$-bit encoding of 0 and 1, where $k$ is the security parameter:

- The receiver generates $4k$ public keys $PK_1, \ldots, PK_{4k}$: $k$ of them are generated
using the key-generation algorithm, and the remaining $3k$ are generated using the
oblivious sampling algorithm. The receiver then sends $PK_1, \ldots, PK_{4k}$ along with
two random $k$-bit messages $M_0, M_1$.

- To encrypt a bit $b$, the sender sends $4k$ ciphertexts (one for each of $PK_1, \ldots, PK_{4k}$),
of which $k$ are encryptions of $M_b$, and the remaining ones are obliviously sampled.

- To decrypt, the receiver decrypts the $k$ ciphertexts for which it knows the
corresponding secret key. If any of the $k$ plaintexts matches $M_0$, it outputs 0 and
otherwise, it outputs 1.
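
The three steps above can be sketched in code. The following is only an illustration of the data flow, not a secure implementation: the underlying PKE is replaced by an insecure stand-in in which the public key equals the secret key, and all function names (gen, o_gen, enc, o_rnd_enc, dec, nc_gen, nc_enc, nc_dec) and the toy parameter K = 64 are assumptions made for this sketch.

```python
import hashlib
import random

K = 64  # toy security parameter in bits (an assumption of this sketch)


def rand_bytes(rng, n):
    return bytes(rng.getrandbits(8) for _ in range(n))


def xor(a, b):
    # zip truncates to the shorter input, so the pad is cut to the message length
    return bytes(x ^ y for x, y in zip(a, b))


# --- INSECURE stand-in for the underlying PKE (pk == sk); structure only ---
def gen(rng):
    key = rand_bytes(rng, 16)
    return key, key


def o_gen(rng):
    # "oblivious" key: a public key with no known secret key
    return rand_bytes(rng, 16), None


def enc(pk, m, rng):
    nonce = rand_bytes(rng, 16)
    return nonce, xor(m, hashlib.sha256(pk + nonce).digest())


def o_rnd_enc(pk, rng):
    # "oblivious" ciphertext: random nonce and random body
    return rand_bytes(rng, 16), rand_bytes(rng, K // 8)


def dec(sk, c):
    nonce, body = c
    return xor(body, hashlib.sha256(sk + nonce).digest())


# --- The scheme described above: 4k keys, k of them with known secret keys ---
def nc_gen(rng):
    m0, m1 = rand_bytes(rng, K // 8), rand_bytes(rng, K // 8)
    T = set(rng.sample(range(4 * K), K))
    keys = [gen(rng) if i in T else o_gen(rng) for i in range(4 * K)]
    e = (m0, m1, [pk for pk, _ in keys])
    d = (T, [sk for _, sk in keys])
    return e, d


def nc_enc(e, b, rng):
    m0, m1, pks = e
    S = set(rng.sample(range(4 * K), K))
    msg = m1 if b else m0
    return [enc(pk, msg, rng) if i in S else o_rnd_enc(pk, rng)
            for i, pk in enumerate(pks)]


def nc_dec(e, d, c):
    m0 = e[0]
    T, sks = d
    J = {dec(sks[i], c[i]) for i in T}
    return 0 if m0 in J else 1


rng = random.Random(2009)
e, d = nc_gen(rng)
print([nc_dec(e, d, nc_enc(e, b, rng)) for b in (0, 1)])
```

Decryption of an encryption of 0 fails only when the sender's set and the receiver's set are disjoint, which for this toy parameter happens with probability below $(3/4)^{64}$.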

Encoding 0 and 1 randomly as $M_0$ and $M_1$ is useful for two reasons:

- That an obliviously sampled ciphertext is equally likely to decrypt to 0 or 1 is
no longer needed to guarantee correctness (c.f. [DN00]). Indeed, reasoning about
decryptions of obliviously sampled ciphertexts is non-trivial for the lattice-based
simulatable PKEs in [GPV08].

- Constructing a simulator becomes much easier as we avoid having to generate
distributions over $k$ independent biased bits conditioned on the majority of the
bits being 0, say. Generating such distributions arises for instance in [CFGN96]
and is related to the first subtlety associated with the naive simulation strategy.
In our construction, the simulated ciphertext comprises $k$ encryptions of $M_0$, $k$
encryptions of $M_1$, and $2k$ obliviously generated ciphertexts. Having these extra $2k$
obliviously generated ciphertexts (which is possible because $\alpha < 1/2$) is crucial
for handling simultaneous corruptions of the sender and the receiver.

Trapdoor Simulatable PKEs from Factoring. Our factoring-based trapdoor simulatable
PKE construction consists of two main steps. First, we modify the Rabin trapdoor
permutations based on squaring modulo a Blum integer so that it remains a permutation
over any arbitrary integer modulus. This relies on the following number-theoretical
structural lemma implicit in [H99, S96, FF02]^4:

Let $N$ be an arbitrary odd $k$-bit integer, and let $Q_N = \{a^{2^k} \pmod N \mid a \in \mathbb{Z}_N^*\}$.
Then, the map $\psi : x \mapsto x^2$ defines a permutation over $Q_N$.

We also provide an efficient algorithm for inverting $\psi$ given the factorization of $N$.
Note that the standard algorithm for computing square roots does not guarantee that the
output lies in $Q_N$. Moreover, the probability that a random square root lies in $Q_N$ may
be exponentially small, so we cannot repeatedly compute random square roots until we
find one in $Q_N$; it is also not clear a-priori how to test membership in $Q_N$ even given
the factorization of $N$.
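
As a sanity check of the structural lemma, the following sketch verifies by brute force that squaring permutes $Q_N$ for small odd moduli. It assumes that $k$ is taken to be the bit length of $N$; the helper names are hypothetical. For contrast, it also shows that the plain set of squares modulo 5 is not permuted by squaring, which is why the $2^k$'th powers are used.

```python
from math import gcd


def q_set(n: int) -> set:
    """Q_N = { a^(2^k) mod N : a in Z_N^* }, with k the bit length of N (assumption)."""
    k = n.bit_length()
    return {pow(a, 2 ** k, n) for a in range(1, n) if gcd(a, n) == 1}


def is_permuted_by_squaring(domain: set, n: int) -> bool:
    """True if x -> x^2 mod n maps the finite set `domain` onto itself."""
    return {x * x % n for x in domain} == domain


# Brute-force check over all odd moduli below 200.
all_ok = all(is_permuted_by_squaring(q_set(n), n) for n in range(3, 200, 2))
print(all_ok)

# Contrast: the ordinary squares modulo 5 are {1, 4}, and squaring sends both to 1.
plain_squares_mod_5 = {a * a % 5 for a in range(1, 5)}
print(is_permuted_by_squaring(plain_squares_mod_5, 5))
```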

The next step transforms the family of trapdoor permutations $\psi$ acting on the
domain $Q_N$ into a family of “enhanced” trapdoor permutations with the same domain
$Q_N$, using an idea from [G04, Section C.1]. The latter has the property that we can
obliviously sample a random element $y$ in $Q_N$ so that given $y$ along with the coin tosses
used to sample $y$, it is infeasible to compute the preimage of $y$ under the permutation
(note that the naive algorithm for sampling a random element of $Q_N$ gives away
its preimage under $\psi$). We will need the oblivious sampling algorithm for a random
element in $Q_N$ in our oblivious sampling algorithm for random ciphertexts. We will also
need to realize trapdoor invertibility for the latter, which requires an efficient algorithm
that, given the factorization of $N$ and an element $y$ in $Q_N$, outputs a random $2^k$'th root
of $y$^5. Note that iteratively computing random square roots $k$ times does not work: after
computing the first square root, we may not end up with a $2^{k-1}$'th power.

3 Preliminaries 

If A is a probabilistic polynomial time (hereafter, ppt) algorithm that runs on input x, 
A(x) denotes the random variable according to the distribution of the output of A on 
input x. We denote by A(x; r) the output of A on input x and random coins r. To 
simplify the notation, we will often omit quantifying over the distribution for r; it will 
usually be clear from the context when r is not fixed, that it is drawn from the uniform 
distribution over strings of the appropriate length. 

We assume that the reader is familiar with the standard definitions of public-key 
encryption schemes and semantic security (c.f. [GM84, G04]). We stress that we allow
decryption errors that are exponentially small in k: 

4 It was shown that $\psi$ defines a permutation over the subgroup $O_N$ of $\mathbb{Z}_N^*$ of odd order,
and that $O_N$ contains $Q_N$; it turns out that $O_N = Q_N$. While $Q_N$ is trivially sampleable, it is not
clear a-priori how to sample from $O_N$.

5 If we are given just $N$ and not its factorization, this problem is at least as hard as factoring
random Blum integers. This is in essence why we only obtain a factoring-based trapdoor
simulatable PKE and not a simulatable PKE.

Definition 1 (encryption scheme). A triple (Gen, Enc, Dec) is an encryption scheme,
if Gen and Enc are ppt algorithms and Dec is a deterministic polynomial-time algorithm
such that for every message $m \in \{0,1\}^*$ of polynomial length,
$\Pr[\mathsf{Gen}(1^k) \to (pk, sk);\ \mathsf{Enc}_{pk}(m) \to c;\ \mathsf{Dec}_{sk}(c) \neq m] < 2^{-\Omega(k)}$.

Non-committing Encryption. For simplicity, we present the definition of a non- 
committing public-key encryption scheme for single-bit messages: 

Definition 2 (non-committing encryption [CFGN96]). A non-committing (bit) encryption
scheme consists of a tuple (NCGen, NCEnc, NCDec, NCSim) where (NCGen,
NCEnc, NCDec) is an encryption scheme and NCSim is the simulation algorithm that
on input $1^k$, outputs $(e, c, \sigma_G^0, \sigma_E^0, \sigma_G^1, \sigma_E^1)$ with the following property: for $b = 0, 1$ the
following distributions are computationally indistinguishable:

- the joint view of an honest sender and an honest receiver in a normal encryption
of $b$:

$\{(e, c, \sigma_G, \sigma_E) \mid (e, d) = \mathsf{NCGen}(1^k; \sigma_G),\ c = \mathsf{NCEnc}_e(b; \sigma_E)\}$

- simulated view of an encryption of $b$:

$\{(e, c, \sigma_G^b, \sigma_E^b) \mid \mathsf{NCSim}(1^k) \to (e, c, \sigma_G^0, \sigma_E^0, \sigma_G^1, \sigma_E^1)\}$

It follows from the definition that a non-committing encryption scheme is also 
semantically secure. 

Encrypting longer messages. Starting with a non-committing bit encryption scheme 
(NCGen, NCEnc, NCDec, NCSim), we may encrypt a longer message of length n by 
generating n independent public keys using NCGen, encrypting each bit of the message
using a different public key and then concatenating the n ciphertexts. Note that this is 
different from the case of semantically secure encryption, where we may encrypt each 
bit using the same public key. 

4 Trapdoor Simulatable Public Key Encryption 

An $\ell$-bit trapdoor simulatable encryption scheme consists of an encryption scheme
(Gen, Enc, Dec) augmented with (oGen, oRndEnc, rGen, rRndEnc). Here, oGen and
oRndEnc are the oblivious sampling algorithms for public keys and ciphertexts, and
rGen and rRndEnc are the respective inverting algorithms^6. We require that, for all
messages $m \in \{0,1\}^\ell$, the following distributions are computationally indistinguishable:

$\{\mathsf{rGen}(r_G),\ \mathsf{rRndEnc}(r_G, r_E, m),\ pk,\ c \mid (pk, sk) = \mathsf{Gen}(1^k; r_G),\ c = \mathsf{Enc}_{pk}(m; r_E)\}$
and $\{\hat{r}_G,\ \hat{r}_E,\ \hat{pk},\ \hat{c} \mid (\hat{pk}, \bot) = \mathsf{oGen}(1^k; \hat{r}_G),\ \hat{c} = \mathsf{oRndEnc}_{\hat{pk}}(1^k; \hat{r}_E)\}$

It follows from the definition that a trapdoor simulatable encryption scheme is also 
semantically secure. 

6 Existence of such inverting algorithms is called trapdoor invertibility. Compared to the
simulatable cryptosystem (without trapdoor) defined in [DN00], rGen (resp. rRndEnc) takes
$r_G$ (resp. $(r_G, r_E, m)$) as the additional trapdoor information.

Encrypting longer messages. We note that if we started only with a trapdoor simulatable
PKE for single bits, we may encrypt a longer message of length n by generating a single
public key pk using Gen, encrypting each bit of the message under pk, and concatenating
the resulting ciphertexts.

5 Non-Committing Encryption from Weaker Assumptions 

Theorem 4. Suppose there exists a trapdoor simulatable encryption scheme. Then, 
there exists a non-committing encryption scheme as well as a universally composable 
oblivious transfer protocol secure against semi-honest, adaptive adversaries. 

We show how to construct a non-committing bit encryption scheme (NCGen, NCEnc, 
NCDec, NCSim) from a $k$-bit trapdoor simulatable PKE (Gen, Enc, Dec) (augmented
with (oGen, oRndEnc, rGen, rRndEnc)). This is sufficient to establish the theorem by
the connection between encrypting single bits and multiple bits as discussed in Sections
3 and 4. Our construction is presented in Figures 2 and 3.

Key Generation NCGen($1^k$):

1. Pick $M_0, M_1$ at random from $\{0,1\}^k$.

2. Choose a random subset $T \subset [4k]$ of size $k$.

3. For $i = 1, 2, \ldots, 4k$, generate a pair $(PK_i, SK_i)$ as follows:

$(PK_i, SK_i) = \begin{cases} \mathsf{Gen}(1^k) & \text{if } i \in T \\ \mathsf{oGen}(1^k) & \text{otherwise} \end{cases}$

Set $e = [M_0, M_1, PK_1, \ldots, PK_{4k}]$ and $d = [T, SK_1, \ldots, SK_{4k}]$.

Encryption NCEnc$_e(b)$:

1. Choose a random subset $S \subset [4k]$ of size $k$.

2. For $i = 1, 2, \ldots, 4k$, generate a ciphertext $c_i$ as follows:

$c_i = \begin{cases} \mathsf{Enc}_{PK_i}(M_b) & \text{if } i \in S \\ \mathsf{oRndEnc}_{PK_i}(1^k) & \text{otherwise} \end{cases}$

Set $c = [c_1, \ldots, c_{4k}]$.

Decryption NCDec$_d(c)$:

1. Compute $J = \{\mathsf{Dec}_{SK_i}(c_i) \mid i \in T\}$.

2. If $M_0 \in J$, output 0; else, output 1.

Fig. 2. Non-Committing Encryption Scheme (NCGen, NCEnc, NCDec)

Simulation NCSim:

1. Pick $M_0, M_1$ at random from $\{0,1\}^k$.

2. Picking the sets $S_0, S_1, T_0, T_1$:

- Pick two random subsets $S_0, T_0$ of $[4k]$ each of size $k$.

- Pick two random subsets $S_1, T_1$ of $[4k] \setminus (S_0 \cup T_0)$ such that $|S_1 \cap T_1| = |S_0 \cap T_0|$.

3. Generating the keys: for $i = 1, 2, \ldots, 4k$, set

$(PK_i, SK_i) = \begin{cases} \mathsf{Gen}(1^k; r_G^i) & \text{if } i \in T_0 \cup S_0 \cup T_1 \cup S_1 \\ \mathsf{oGen}(1^k; \hat{r}_G^i) & \text{otherwise} \end{cases}$

4. Generating the ciphertext: for $i = 1, 2, \ldots, 4k$, set

$c_i = \begin{cases} \mathsf{Enc}_{PK_i}(M_0; r_E^i) & \text{if } i \in S_0 \\ \mathsf{Enc}_{PK_i}(M_1; r_E^i) & \text{if } i \in S_1 \\ \mathsf{oRndEnc}_{PK_i}(1^k; \hat{r}_E^i) & \text{otherwise} \end{cases}$

5. Simulating an opening to $b$: set $\sigma_G^b = \{T_b, u_G^1, \ldots, u_G^{4k}\}$ and $\sigma_E^b = \{S_b, u_E^1, \ldots, u_E^{4k}\}$, where

$u_G^i = \begin{cases} r_G^i & \text{if } i \in T_b \\ \mathsf{rGen}(r_G^i) & \text{if } i \in (T_0 \cup T_1 \cup S_0 \cup S_1) \setminus T_b \\ \hat{r}_G^i & \text{otherwise} \end{cases}$

$u_E^i = \begin{cases} r_E^i & \text{if } i \in S_b \\ \mathsf{rRndEnc}(r_G^i, r_E^i, M_{1-b}) & \text{if } i \in S_{1-b} \\ \hat{r}_E^i & \text{otherwise} \end{cases}$

Set $e = [M_0, M_1, PK_1, \ldots, PK_{4k}]$, $c = [c_1, \ldots, c_{4k}]$. Additionally output $\sigma_G^0, \sigma_E^0, \sigma_G^1, \sigma_E^1$.

Fig. 3. Non-Committing Encryption Scheme NCSim

Correctness. We begin by establishing correctness.

- Assume that the input $[c_1, \ldots, c_{4k}]$ to the decryption algorithm is a random
encryption of 0. Recall that $J = \{\mathsf{Dec}_{SK_i}(c_i) \mid i \in T\}$ and we will output 0
unless $M_0 \notin J$. It is easy to see that $\Pr[M_0 \notin J] \le \binom{3k}{k} / \binom{4k}{k} + 2^{-\Omega(k)}$, where
the first summand comes from the probability that $S \cap T = \emptyset$ and the second
bounds the probability of a decryption error in the underlying encryption scheme
(Gen, Enc, Dec).

- Assume that the input $[c_1, \ldots, c_{4k}]$ to the decryption algorithm is a random
encryption of 1. Recall that $J = \{\mathsf{Dec}_{SK_i}(c_i) \mid i \in T\}$ and we will output 1
unless $M_0 \in J$. To bound $\Pr[M_0 \in J]$, observe that the distribution of $J$ depends
only on $M_1, PK_1, \ldots, PK_{4k}, SK_1, \ldots, SK_{4k}$ and the coin tosses used to generate
$c_1, \ldots, c_{4k}$, and is therefore independent of the choice of a random $M_0$. This means
that for each $i \in T$, the probability that $\mathsf{Dec}_{SK_i}(c_i)$ equals $M_0$ is $2^{-k}$. Taking a
union bound, we obtain $\Pr[M_0 \in J] \le k \cdot 2^{-k}$.
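
To see why the first summand in the bound for encryptions of 0 is exponentially small, the ratio of binomial coefficients can be expanded as a product of $k$ factors, each at most $3/4$. The following short derivation is a worked illustration and is not taken from the original text; it only assumes that $S$ and $T$ are independent uniform subsets of $[4k]$ of size $k$.

```latex
\Pr[S \cap T = \emptyset]
  = \frac{\binom{3k}{k}}{\binom{4k}{k}}
  = \frac{(3k)!\,(3k)!}{(2k)!\,(4k)!}
  = \prod_{i=0}^{k-1} \frac{3k - i}{4k - i}
  \le \left(\frac{3}{4}\right)^{k},
\qquad\text{since}\qquad
\frac{3k - i}{4k - i} \le \frac{3}{4} \iff i \ge 0 .
```

Together with the second bullet, both decryption errors are therefore at most $(3/4)^k + 2^{-\Omega(k)}$ and $k \cdot 2^{-k}$ respectively.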

Security. We need to show that for each $b = 0, 1$, a normal encryption of $b$ and a
simulated encryption of $b$ are computationally indistinguishable. Note that the view in a
normal encryption of $b$ contains two sets $T, S$ which we will label as $T_b, S_b$, and we will
append to the view two sets $T_{1-b}, S_{1-b}$ that are determined as follows: pick two random
subsets $S_{1-b}, T_{1-b}$ of $[4k] \setminus (S_b \cup T_b)$ such that $|S_1 \cap T_1| = |S_0 \cap T_0|$; call this distribution
$H_0$. We will also append to the view in a simulated encryption of $b$ the sets $T_{1-b}, S_{1-b}$
as determined by the experiment NCSim; call this distribution $H_{4k}$. We will show that
the augmented distributions $H_0$ and $H_{4k}$ are computationally indistinguishable in two
steps:

Reasoning about the sets. First, we claim that the 4-tuple $(S_0, T_0, S_1, T_1)$ in the
augmented distribution $H_0$ and in $H_{4k}$ are identically distributed. If $b = 0$, this is
obvious since the distributions are defined in exactly the same way. The case for $b = 1$
follows from a symmetry argument, namely that if we switch $(S_0, T_0)$ with $(S_1, T_1)$ in
the experiment NCSim, we get exactly the same distribution. Henceforth, it suffices to
argue that $H_0$ and $H_{4k}$ are computationally indistinguishable, conditioned on some
fixed $(S_0, T_0, S_1, T_1)$ in both $H_0$ and $H_{4k}$. We may now WLOG focus on the case
$b = 0$. In fact, we may as well also fix $M_0, M_1$ in both $H_0$ and $H_{4k}$. In addition to
$S_0, T_0, S_1, T_1, M_0, M_1$, the distributions $H_0, H_{4k}$ comprise:

- $4k$ public keys $PK_1, \ldots, PK_{4k}$ (generated using either Gen or oGen);

- $4k$ ciphertexts $c_1, \ldots, c_{4k}$ (generated using either Enc or oRndEnc);

- $4k$ sets of coin tosses $u_G^1, \ldots, u_G^{4k}$ for generating the public/secret keys; and

- $4k$ sets of coin tosses $u_E^1, \ldots, u_E^{4k}$ for generating the ciphertexts.

That is, we have $4k$ tuples of the form $(PK_i, c_i, u_G^i, u_E^i)$, $i = 1, \ldots, 4k$ in each view.
Since $S_0, T_0, S_1, T_1$ are fixed, each of these $4k$ tuples is independently sampled from
some distribution that only depends on the index $i$. Denote by $X_1, \ldots, X_{4k}$ the random
variables for the $4k$ tuples in $H_0$, and $Y_1, \ldots, Y_{4k}$ the random variables for the $4k$ tuples
in $H_{4k}$.
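
The set-sampling step that both augmented distributions share can be sketched as follows. This is an illustrative sketch only: the function name sample_sets is hypothetical, and it assumes that $S_1$ and $T_1$ each have size $k$, as the symmetry argument above suggests.

```python
import random


def sample_sets(k: int, rng: random.Random):
    """Sample (S0, T0, S1, T1) as in step 2 of NCSim.

    Assumption of this sketch: S1 and T1 each have size k.
    """
    universe = range(4 * k)
    s0 = set(rng.sample(universe, k))
    t0 = set(rng.sample(universe, k))
    j = len(s0 & t0)                      # required size of S1 ∩ T1
    used = s0 | t0
    rest = [i for i in universe if i not in used]
    # Draw 2k - j distinct indices: j shared, k - j only in S1, k - j only in T1.
    picked = rng.sample(rest, 2 * k - j)
    common = picked[:j]
    s1 = set(common + picked[j:k])
    t1 = set(common + picked[k:])
    return s0, t0, s1, t1


s0, t0, s1, t1 = sample_sets(8, random.Random(0))
print(len(s0 & t0), len(s1 & t1))
```

There is always enough room for the second pair: the complement of $S_0 \cup T_0$ has $2k + j$ elements, while $S_1 \cup T_1$ needs only $2k - j$.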

The hybrid argument. Next, we argue that $X_i$ and $Y_i$ are computationally indistinguishable
for $i = 1, \ldots, 4k$, from which the indistinguishability of $H_0$ and $H_{4k}$ follows via
a hybrid argument. There are several cases we need to consider:

- $i \in T_0$ or $i \in [4k] \setminus (T_0 \cup S_0 \cup T_1 \cup S_1)$. It is easy to verify that in either of these
cases, $X_i$ and $Y_i$ are identically distributed.

- $i \in S_1$ (“oGen, oRndEnc $\approx$ Gen, Enc”). Here, $X_i$ is the distribution

$\{\hat{pk}, \hat{c}, \hat{r}_G, \hat{r}_E \mid (\hat{pk}, \bot) = \mathsf{oGen}(\hat{r}_G),\ \hat{c} = \mathsf{oRndEnc}_{\hat{pk}}(\hat{r}_E)\}$

and $Y_i$ is the distribution

$\{pk, c, \mathsf{rGen}(r_G), \mathsf{rRndEnc}(r_G, r_E, M_1) \mid (pk, sk) = \mathsf{Gen}(r_G),\ c = \mathsf{Enc}_{pk}(M_1; r_E)\}$.

Indistinguishability follows immediately from the security of the trapdoor simulatable
PKE.

- $i \in S_0 \setminus T_0$ (“oGen, Enc $\approx$ Gen, Enc”). Here, $X_i$ is the distribution

$\{\hat{pk}, c, \hat{r}_G, r_E \mid (\hat{pk}, \bot) = \mathsf{oGen}(\hat{r}_G),\ c = \mathsf{Enc}_{\hat{pk}}(M_0; r_E)\}$

and $Y_i$ is the distribution

$\{pk, c, \mathsf{rGen}(r_G), r_E \mid (pk, sk) = \mathsf{Gen}(r_G),\ c = \mathsf{Enc}_{pk}(M_0; r_E)\}$.

Indistinguishability follows again from the security of the trapdoor simulatable
PKE.

- $i \in T_1 \setminus S_1$ (“oGen, oRndEnc $\approx$ Gen, oRndEnc”). Here, $X_i$ is the distribution

$\{\hat{pk}, \hat{c}, \hat{r}_G, \hat{r}_E \mid (\hat{pk}, \bot) = \mathsf{oGen}(\hat{r}_G),\ \hat{c} = \mathsf{oRndEnc}_{\hat{pk}}(\hat{r}_E)\}$

and $Y_i$ is the distribution

$\{pk, \hat{c}, \mathsf{rGen}(r_G), \hat{r}_E \mid (pk, sk) = \mathsf{Gen}(r_G),\ \hat{c} = \mathsf{oRndEnc}_{pk}(\hat{r}_E)\}$.

Indistinguishability follows again from the security of the trapdoor simulatable
PKE.

Improving the Efficiency. Instead of using sets $S, T \subset [4k]$ of size $k$, we choose
$S, T \subset [40]$ of size 10. The previous analysis still goes through, except we now have a
constant decryption error. To address this problem, we first encode the message^7 with
a linear-rate error-correcting code that corrects a constant fraction of errors, and then
encrypt the codeword with the encryption scheme with constant error.
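
To get a feel for the size of this constant error, the sketch below computes the probability that two independent uniform subsets are disjoint, which is the dominant failure event when encrypting 0. The function name and the comparison value $k = 128$ are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def disjoint_probability(n: int, s: int) -> Fraction:
    """Probability that two independent uniform size-s subsets of [n] are disjoint."""
    return Fraction(comb(n - s, s), comb(n, s))


# Constant-size variant: S, T are subsets of [40] of size 10.
p_const = disjoint_probability(40, 10)

# Original variant with 4k keys, for an illustrative k = 128.
p_full = disjoint_probability(4 * 128, 128)

print(float(p_const))
print(float(p_full))
```

The constant-size variant fails a few percent of the time per bit, which is why the message is first encoded with an error-correcting code.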

6 Trapdoor Simulatable PKE from Hardness of Factoring 

Theorem 5. Suppose that factoring Blum integers is hard on average and that Blum
integers are dense. Then there exists a trapdoor simulatable PKE.

For simplicity, we only present a 1-bit trapdoor simulatable encryption scheme; we may 
encrypt longer messages by encrypting bit by bit. 


A number-theoretic lemma. Fix any $k$-bit integer modulus $N$; we will work with
the group $\mathbb{Z}_N^*$. We will use factor($N$) to denote the factorization of $N$, and we define
$Q_N = \{a^{2^k} \mid a \in \mathbb{Z}_N^*\}$. Now, consider the map $\psi_N : Q_N \to Q_N$ given by $\psi_N(x) =
x^2 \pmod N$. As shown in [H99, Facts 3.5-3.7], $\psi_N$ defines a permutation on $Q_N$. We
provide a more direct proof which also yields an efficient algorithm to invert $\psi_N$ given
factor($N$).

Claim. The map $\psi_N$ defines a permutation on $Q_N$.

Proof. Let $q$ denote the largest odd divisor of $\varphi(N)$, where $\varphi(\cdot)$ is Euler’s totient
function. It is easy to see that $\varphi(N)$ divides $2^k q$, since $N < 2^k$. Take any $y \in Q_N$,
where $y = a^{2^k}$. Then by Euler’s theorem, $y^q = 1 \pmod N$ and thus $\psi_N(y^{(q+1)/2}) =
y \pmod N$. Clearly, $y^{(q+1)/2} \in Q_N$, so the map $\psi_N$ is surjective. Moreover, the range
and domain of $\psi_N$ have equal sizes, so $\psi_N$ must define a bijection. □
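
The inversion algorithm implicit in the proof can be sketched as follows. The helper names and the toy modulus $N = 77 = 7 \cdot 11$ (with $k = 7$) are assumptions for illustration; the factorization is passed as a dictionary mapping each prime to its exponent.

```python
from math import gcd


def euler_phi(factors: dict) -> int:
    """phi(N) from the factorization {prime: exponent}."""
    result = 1
    for p, e in factors.items():
        result *= (p - 1) * p ** (e - 1)
    return result


def largest_odd_divisor(m: int) -> int:
    while m % 2 == 0:
        m //= 2
    return m


def psi(x: int, n: int) -> int:
    return x * x % n


def psi_inverse(y: int, n: int, factors: dict) -> int:
    """Invert squaring on Q_N by raising to the (q+1)/2'th power, as in the proof."""
    q = largest_odd_divisor(euler_phi(factors))
    return pow(y, (q + 1) // 2, n)


# Toy example: N = 77, k = 7 (bit length of N).
N, K, FACTORS = 77, 7, {7: 1, 11: 1}
Q_N = {pow(a, 2 ** K, N) for a in range(1, N) if gcd(a, N) == 1}
print(all(psi(psi_inverse(y, N, FACTORS), N) == y for y in Q_N))
```

Here $\varphi(77) = 60$, so $q = 15$ and the inverse is simply $y \mapsto y^8 \bmod 77$.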

The construction. We sketch the construction here; the formal construction is shown
in Figure 4.

7 The codeword length (or, equivalently the message length) should be O(k). Then, by the Chernoff
bound, the number of decryption errors remains a constant fraction of the codeword length with
overwhelming probability.

Key generation Gen($1^k$):

1. Run Bach’s algorithm using the randomness $r_G$ to sample random $N_1, \ldots, N_{k^3} \in
\{0,1\}^k$ along with their factorizations factor($N_1$), ..., factor($N_{k^3}$).

2. Set $PK = [N_1, \ldots, N_{k^3}]$ and $SK = [\mathrm{factor}(N_1), \ldots, \mathrm{factor}(N_{k^3})]$.

Encryption Enc($b$):

1. Parse the randomness $r_E$ as $(\alpha_1, \ldots, \alpha_{k^3}) \in \mathbb{Z}_{N_1}^* \times \cdots \times \mathbb{Z}_{N_{k^3}}^*$, $r_1, \ldots, r_{k^3} \in \{0,1\}^k$
and $b_1, \ldots, b_{k^3-1} \in \{0,1\}$.

2. Compute $b_{k^3} = b \oplus b_1 \oplus \cdots \oplus b_{k^3-1}$.

3. Compute $x_i = \alpha_i^{2^k} \in Q_{N_i}$, $i = 1, \ldots, k^3$.

4. Output $[\pi_{N_i}(x_i), r_i, (x_i \cdot r_i) \oplus b_i,\ i = 1, \ldots, k^3]$.

Decryption Dec($c$):

1. Parse $c$ as $[y_i, r_i, \beta_i,\ i = 1, \ldots, k^3]$.

2. Compute $b_i = (\pi_{N_i}^{-1}(y_i) \cdot r_i) \oplus \beta_i$, $i = 1, \ldots, k^3$.

3. Output $b_1 \oplus \cdots \oplus b_{k^3}$.

Oblivious key generation oGen($1^k$):

1. Parse the randomness $\hat{r}_G \in \{0,1\}^{k^4}$ as $N_1, \ldots, N_{k^3} \in \{0,1\}^k$.

2. Output $(N_1, \ldots, N_{k^3})$.

Trapdoor invertibility key generation rGen($r_G$):

1. Run Gen($r_G$) to obtain $\hat{r}_G = (N_1, \ldots, N_{k^3})$.

2. Output $\hat{r}_G$.

Oblivious sampling of ciphertexts oRndEnc($1^k$):

1. Parse the randomness $\hat{r}_E$ as $(\gamma_1, \ldots, \gamma_{k^3}) \in \mathbb{Z}_{N_1}^* \times \cdots \times \mathbb{Z}_{N_{k^3}}^*$, $s_1, \ldots, s_{k^3} \in \{0,1\}^k$
and $\beta_1, \ldots, \beta_{k^3} \in \{0,1\}$.

2. Compute $y_i = \gamma_i^{2^k} \in Q_{N_i}$, $i = 1, \ldots, k^3$.

3. Output $[y_i, s_i, \beta_i,\ i = 1, \ldots, k^3]$.

Trapdoor invertibility for ciphertexts rRndEnc($r_G, r_E, b$):

1. Use $r_G$ to compute factor($N_1$), ..., factor($N_{k^3}$), and parse $r_E$ as in Enc.

2. Set $s_i = r_i$ and $\beta_i = (x_i \cdot r_i) \oplus b_i$, $i = 1, \ldots, k^3$.

3. Pick a random $\gamma_i$ uniformly from the set $\{\gamma_i \in \mathbb{Z}_{N_i}^* \mid \gamma_i^{2^k} = \pi_{N_i}(x_i)\}$.

4. Output $\hat{r}_E = (\gamma_1, \ldots, \gamma_{k^3}, s_1, \ldots, s_{k^3}, \beta_1, \ldots, \beta_{k^3})$.

Fig. 4. Trapdoor Simulatable PKE from hardness of factoring Blum integers

Step 1: First, we construct a family of “weakly one-way” enhanced trapdoor
permutations. We start by modifying $\psi_N$ to obtain a new family of permutations
$\pi_N$; the modification is analogous to that in [G04, Section C.1] to obtain enhanced
trapdoor permutations from Rabin’s trapdoor permutations. The permutations $\pi_N :
Q_N \to Q_N$ are indexed by a k-bit integer N and are given by:

$\pi_N(x) = \psi_N^{k+1}(x) = x^{2^{k+1}} \pmod{N}$

and the trapdoor is factor($N$). We may sample from this family by running Bach’s
algorithm [B88, K02] to pick a random k-bit integer along with its factorization.

It is easy to verify that $\pi_N$ is a family of trapdoor permutations. Clearly, $\pi_N$ is a
permutation because it is the (k + 1)-fold iterate of a permutation $\psi_N$. Given the
index N, $\pi_N$ is efficiently computable by repeated squaring. Given the trapdoor
factor($N$), $\pi_N^{-1}$ is efficiently computable by simply mapping y
to $y^{((q+1)/2)^{k+1}}$, i.e., raising y to the $(q+1)/2$’th power k + 1 times. Here, q
denotes the largest odd divisor of $\phi(N)$, which is easy to compute with the trapdoor.
Moreover, we can show that if N is a Blum integer (which occurs with probability
$\Omega(1/k^2)$ [GM04, RS94]), then inverting $\pi_N$ given N is at least as hard as factoring
N. This implies that $\pi_N$ is one-way with probability $\Omega(1/k^2)$ over the choice of N.
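
The forward and inverse maps of Step 1 can be illustrated with a small sketch. It assumes that $Q_N$ is the set of $2^k$-th powers in $Z^*_N$ (the definition precedes this excerpt, so this is an assumption), and it uses a hypothetical toy modulus N = 7 · 11 that offers no security. The function names are illustrative only.

```python
from math import gcd

def largest_odd_divisor(m):
    """Strip all factors of two from m."""
    while m % 2 == 0:
        m //= 2
    return m

def pi_forward(x, N, k):
    """pi_N(x) = x^(2^(k+1)) mod N, i.e. k+1 repeated squarings."""
    return pow(x, 2 ** (k + 1), N)

def pi_inverse(y, p, r, k):
    """Invert pi_N using the trapdoor (p, r): raise to (q+1)/2, k+1 times."""
    N = p * r
    q = largest_odd_divisor((p - 1) * (r - 1))  # largest odd divisor of phi(N)
    e = (q + 1) // 2
    for _ in range(k + 1):
        y = pow(y, e, N)  # each step undoes one squaring inside Q_N
    return y

# Toy parameters (assumption): N = 77 is a 7-bit Blum integer.
p, r = 7, 11
N = p * r
k = N.bit_length()
# Assumed definition: Q_N is the set of 2^k-th powers in Z_N^*.
Q_N = sorted({pow(a, 2 ** k, N) for a in range(1, N) if gcd(a, N) == 1})
round_trip_ok = all(pi_inverse(pi_forward(x, N, k), p, r, k) == x for x in Q_N)
```

Every element of $Q_N$ has odd order dividing q, which is why raising to the $(q+1)/2$’th power acts as a square root inside $Q_N$.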

Step 2: Construct a “weak” encryption scheme using the standard construction of
PKE from trapdoor permutations via the Goldreich-Levin hard-core predicate. The
public key is N, the secret key is factor($N$), and to encrypt a bit b, we pick a
random $x \in Q_N$, $r \in \{0,1\}^k$ and output $(\pi_N(x), r, (x \cdot r) \oplus b)$, where $x \cdot r$ is
the standard dot-product of k-bit strings. Again, this scheme will be semantically
secure with probability $\Omega(1/k^2)$ over the choice of N.
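
A minimal sketch of the bit encryption in Step 2 follows. It again assumes that $Q_N$ is the set of $2^k$-th powers and uses the hypothetical toy modulus N = 77; the names `encrypt`, `decrypt` and `dot` are illustrative and not taken from the paper.

```python
import secrets
from math import gcd

P, R = 7, 11                 # toy trapdoor (assumption), no security
N = P * R
K = N.bit_length()           # k-bit modulus

def odd_part(m):
    while m % 2 == 0:
        m //= 2
    return m

E = (odd_part((P - 1) * (R - 1)) + 1) // 2   # (q+1)/2

def dot(x, r):
    """Inner product mod 2 of the binary representations of x and r."""
    return bin(x & r).count("1") % 2

def sample_QN():
    """Sample x in Q_N as a 2^K-th power of a random unit (assumed definition)."""
    while True:
        a = secrets.randbelow(N - 1) + 1
        if gcd(a, N) == 1:
            return pow(a, 2 ** K, N)

def encrypt(b):
    x = sample_QN()
    r = secrets.randbits(K)
    return pow(x, 2 ** (K + 1), N), r, dot(x, r) ^ b

def decrypt(c):
    y, r, beta = c
    for _ in range(K + 1):   # invert pi_N with the trapdoor exponent
        y = pow(y, E, N)
    return dot(y, r) ^ beta
```

Decryption recovers x from $\pi_N(x)$ and removes the Goldreich-Levin mask $x \cdot r$.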

Step 3: To boost the security of the “weak” encryption scheme, we define a new
scheme where the public key is $k^3$ random k-bit strings $N_1, \ldots, N_{k^3}$ (with
overwhelming probability, one of these is a Blum integer), and to encrypt a bit
b, we pick random $b_1, \ldots, b_{k^3}$ such that $b = b_1 \oplus \cdots \oplus b_{k^3}$ and concatenate
the encryptions of $b_1, \ldots, b_{k^3}$ under the respective public keys $N_1, \ldots, N_{k^3}$. By
a standard argument (cf. [Y82, DP92]), this encryption scheme is semantically
secure in the standard sense.
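
The XOR splitting of the plaintext bit used in Step 3 can be sketched as follows. The helper names `share_bit` and `reconstruct` are hypothetical; the sketch only shows the sharing, not the per-share encryption.

```python
import secrets
from functools import reduce
from operator import xor

def share_bit(b, m):
    """Split bit b into m bits whose XOR equals b."""
    shares = [secrets.randbits(1) for _ in range(m - 1)]
    # The last share is fixed so that all m shares XOR to b.
    shares.append(reduce(xor, shares, b))
    return shares

def reconstruct(shares):
    """Recombine the shares by XOR."""
    return reduce(xor, shares, 0)
```

Any m - 1 of the shares are uniformly random, so the bit stays hidden as long as at least one of the underlying encryptions is secure.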

Analysis. Indeed, we claim something stronger - that the encryption scheme derived 
in Step 3 is a trapdoor simulatable PKE. 

- (Oblivious sampling & trapdoor invertibility for key generation) This is trivial,
since a random public key corresponds to a string in $\{0,1\}^{k^4}$. We can clearly
sample such a public key without learning the secret key.

- (Oblivious sampling & trapdoor invertibility for random ciphertext) For simplicity,
we present the algorithms for sampling random ciphertext for the scheme obtained
in Step 2. Here, sampling is easy: on input the public key N, pick $\gamma \in Z^*_N$, $s \in
\{0,1\}^k$, $\beta \in \{0,1\}$ and output $(\gamma^{2^k}, s, \beta)$. To implement reverse sampling, we
need an efficient algorithm that, given factor($N$) and $x \in Q_N$, outputs a random
element of the set $\{\gamma \in Z^*_N \mid \gamma^{2^k} = \pi_N(x) = x^{2^{k+1}}\}$. This can be accomplished
as follows: pick a random $\eta \in Z^*_N$ and output $x^2 \cdot \eta / (\eta^{2^k})^{((q+1)/2)^k}$, where q is, as
before, the largest odd divisor of $\phi(N)$. This works because $\eta / (\eta^{2^k})^{((q+1)/2)^k}$ will
be a random $2^k$’th root of 1 (mod N).
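
The reverse-sampling step can be sketched under the reading that the sampled value must satisfy $\gamma^{2^k} = x^{2^{k+1}}$ and that $Q_N$ is the set of $2^k$-th powers; both points are reconstructed from a damaged source and should be treated as assumptions. The toy modulus N = 77 and the function names are hypothetical.

```python
import secrets
from math import gcd

P, R = 7, 11                 # toy trapdoor (assumption), no security
N = P * R
K = N.bit_length()

def odd_part(m):
    while m % 2 == 0:
        m //= 2
    return m

E = (odd_part((P - 1) * (R - 1)) + 1) // 2   # (q+1)/2

def random_root_of_unity():
    """Return eta / (eta^(2^K))^(((q+1)/2)^K), a 2^K-th root of 1 mod N."""
    while True:
        eta = secrets.randbelow(N - 1) + 1
        if gcd(eta, N) == 1:
            break
    u = pow(eta, 2 ** K, N)
    for _ in range(K):
        u = pow(u, E, N)     # take K successive square roots inside Q_N
    return eta * pow(u, -1, N) % N

def reverse_sample(x):
    """Return gamma with gamma^(2^K) = x^(2^(K+1)) mod N."""
    return pow(x, 2, N) * random_root_of_unity() % N

Q_N = sorted({pow(a, 2 ** K, N) for a in range(1, N) if gcd(a, N) == 1})
```

The factor $x^2$ supplies one particular solution, and multiplying by a root of unity randomizes over all solutions.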

For the actual proof of security, we will need to show that if N is a random Blum integer, 
then the following distributions are computationally indistinguishable for every b: 

$\{(N, \gamma, \pi_N(x), r, (x \cdot r) \oplus b)\}$ and $\{(N, \gamma, \gamma^{2^k}, r, \beta)\}$

The first distribution corresponds to an encryption of b using modulus N and random-
ness (x, r) along with $\gamma$, the output of rRndEnc (a random solution to the equation
$\gamma^{2^k} = \pi_N(x)$). The second corresponds to an obliviously generated ciphertext along
with the randomness. If there exists an efficient distinguisher, then there exists an
efficient procedure A that on input N, $\gamma$, outputs $\pi_N^{-1}(\gamma^{2^k})$ with noticeable probability.
Since squaring is a bijection on quadratic residues modulo Blum integers, the output
of A is also the 4th root of $\gamma^2$. We may then use a reduction in [G04, Section C.1] to
derive from A an algorithm for factoring N with noticeable probability.

7 Oblivious Transfer and MPC

We describe the construction underlying Theorem 3, which proceeds in two steps:

Step 1: We begin with the [CLOS02] construction of a semi-honest OT protocol as
applied to our non-committing encryption scheme, and observe that the protocol is 
secure against malicious senders. For that, we just need to show how to extract the 
sender’s input when the receiver is honest. In this case, the simulator will generate 
the public keys sent by the receiver in the first message along with the secret keys, 
so that it can then extract the malicious sender’s input by decrypting. 

Step 2: Next, we apply the compiler in [CDSMW09] to “boost” the security guarantee
from tolerating semi-honest receivers to tolerating malicious receivers. (Note that
we will not need to apply OT reversal as in [CDSMW09].)

Acknowledgements. We thank Ran Canetti, Yuval Ishai, Jonathan Katz, and Chris 
Peikert for helpful discussions and clarifications. 

Abstract. We give a construction of non-malleable statistically hiding
commitments based on the existence of one-way functions. Our construc- 
tion employs statistically hiding commitment schemes recently proposed 
by Haitner and Reingold [Q. and special-sound WI proofs. Our proof of 
security relies on the message scheduling technique introduced by Dolev, 

Dwork and Naor Q, and requires only the use of black-box techniques. 

1 Introduction 

A commitment scheme is an interactive protocol between two parties, the com- 
mitter, who holds a value, and the receiver. It usually consists of two phases: the 
commit phase and the reveal phase. During the commit phase, the committer 
puts a value in a “locked box” and sends it to the receiver. In the reveal phase, 
the committer sends the “key” to the receiver, then the receiver opens the box 
and retrieves the value. Two basic properties of a commitment scheme are the 
hiding property (the receiver cannot learn the committed value before the reveal 
phase) and the binding property (the committer is bound to one value after
the commit phase). There are two fundamental types of commitment schemes,
statistically hiding and statistically binding. In this work, we focus mainly on sta-
tistically hiding commitment schemes, where the hiding property holds against
unbounded receivers while the binding property is required to hold only against 
polynomially bounded senders. 

The concept of non-malleability was first introduced by Dolev et al. 0.
The basic properties of commitment schemes cannot prevent malleable attacks 
mounted by a man-in-the-middle adversary who has full control of the commu- 
nication channel between the committer and the receiver. Loosely speaking, a 
commitment scheme is non-malleable if one cannot transform the commitment 
of a value into a commitment of a related value. This kind of non-malleability 
is called non-malleability with respect to commitment j3]- The notion of non- 
malleability used by Di Crescenzo et al. 0 is called non-malleability with respect 
to opening , i.e., the adversary cannot construct a commitment from a given one, 
such that after having seen the opening of the original commitment, the adver- 
sary is able to correctly open his commitment with a related value. In the rest 
of this paper, when we say non-malleability, we actually mean non-malleability 
with respect to opening. 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 303–318, 2009.

© International Association for Cryptologic Research 2009



Z. Zhang et al. 


1.1 Related Work 

Statistically hiding commitment schemes were first shown to exist based on 
number-theoretic assumptions fold) . or more generally, based on any collec- 
tion of claw- free permutations P with an efficiently-recognizable index set 0. 
Subsequent work on constructing statistically hiding commitment schemes are 
based on collision-resistant hash functions |HJ, or based on any one-way permu- 
tation 032, or based on regular one-way functions m Nguyen et al. 02| and 
Haitner and Reingold [I] made fundamental progress by constructing statisti- 
cally hiding commitment schemes based on the minimal cryptographic assump- 
tion that one-way functions exist. 

Based on number-theoretic assumptions, non-malleable statistically hiding 
commitment schemes were designed in USE! assuming the existence of a common 
reference string that is shared by the two players before the protocol execution. 
Thus, their schemes do not work in the plain model (i.e., without setup assump- 
tions). More recently, Pass and Rosen [TI| constructed a non-malleable commit- 
ment scheme that was statistically hiding based on a family of collision-resistant 
hash functions. Their scheme is round-efficient and needs only constant-round 
communication. However, the security proof relies on non-black-box techniques 
and is not efficient. 

As one of the central goals of cryptography is to reduce complexity assumptions 
for various cryptographic primitives and construct them under more standard as- 
sumptions, there remain open questions as to whether or not non-malleable statis- 
tically hiding commitment can be based solely on the existence of one-way functions, 
and be shown secure relying only on black-box techniques. 

1.2 Our Result 

In this paper, we give affirmative answers to both of the questions posed above. 
We show that the existence of one-way functions is a sufficient condition for the
existence of non-malleable statistically hiding commitment. 

Theorem 1. If one-way functions exist, then there exists a non-malleable sta- 
tistically hiding commitment scheme. 

Our commitment scheme uses the commitment scheme P to commit to the 
desired value, but modifies the opening process by adding a “trapdoor” that can
be extracted and used by the simulator to cheat in the reveal phase, and would 
not be known to the committer in a real execution. Although the extraction 
requires rewinding, we rely on the message scheduling technique of Lin et al. ca, 
which is a slight modification of the message scheduling technique introduced 
by Dolev et al. j2|, to show this will suffice to prove the non-malleability. Our 
proof requires only standard black-box techniques. As a tradeoff, however, our 
protocol needs a polynomial number of rounds of interaction.

The preliminaries and definitions are given in Section 2. Our non-malleable
statistically hiding commitment scheme is presented in Section 3.

2 Preliminaries and Definitions

For any NP language L, note that there is a natural witness relation $R_L$ contain-
ing pairs (x, w) where w is a witness for the membership of x in L. A function
$\mu(\cdot)$, where $\mu : N \to [0,1]$, is called negligible if for every positive polynomial
$p(\cdot)$ and for all sufficiently large $n \in N$, $\mu(n) < 1/p(n)$. A probability ensemble is a
sequence $X = \{X_i\}_{i \in I}$ of random variables, where I is a countable index set
and $X_i$ is a random variable ranging over $\{0,1\}^{p(|i|)}$ for some polynomial $p(\cdot)$.
Two probability ensembles $X = \{X_i\}_{i \in I}$ and $Y = \{Y_i\}_{i \in I}$ are computationally
indistinguishable if no probabilistic polynomial-time (PPT) algorithm distin-
guishes between them with more than negligible probability. Due to space limitations,
we assume the reader is familiar with interactive proofs.

Special-sound proofs. A 3-round public-coin interactive proof for the language
$L \in NP$ with witness relation $R_L$ is special-sound with respect to $R_L$, if for any
two accepting transcripts $(\alpha, \beta, \gamma)$ and $(\alpha', \beta', \gamma')$ for some statement $x \in L$, such
that $\alpha = \alpha'$ and $\beta \neq \beta'$, a witness w such that $(x, w) \in R_L$ can be computed by
a polynomial-time deterministic procedure.
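
To make special soundness concrete, the sketch below uses Schnorr's identification protocol for discrete logarithms. This is a swapped-in example of a 3-round special-sound protocol, not the Hamiltonicity-based proof this paper relies on, and the toy group (P = 23, Q = 11, G = 2) is a hypothetical choice without security.

```python
P, Q, G = 23, 11, 2   # toy group (assumption): G has order Q in Z_P^*

def prove(w, a_rand, e):
    """Produce a transcript (alpha, beta, gamma) for witness w and challenge e."""
    alpha = pow(G, a_rand, P)
    gamma = (a_rand + e * w) % Q
    return alpha, e, gamma

def verify(h, transcript):
    """Accept iff G^gamma == alpha * h^beta (mod P)."""
    alpha, beta, gamma = transcript
    return pow(G, gamma, P) == alpha * pow(h, beta, P) % P

def extract(t1, t2):
    """Deterministically compute w from two transcripts with equal alpha, distinct beta."""
    (a1, b1, g1), (a2, b2, g2) = t1, t2
    assert a1 == a2 and b1 != b2
    return (g1 - g2) * pow((b1 - b2) % Q, -1, Q) % Q

w = 7
h = pow(G, w, P)               # statement: h = G^w
t1 = prove(w, a_rand=5, e=3)   # same first message,
t2 = prove(w, a_rand=5, e=9)   # different challenges
```

Rewinding-based extraction in the simulator relies on exactly this property: a second accepting answer to a fresh challenge reveals the witness.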

2.1 Witness Indistinguishability 

The concept of witness indistinguishability was proposed by Feige and Shamir 
fT7)l . An interactive proof system is witness indistinguishable (WI) if the verifier 
cannot tell which of the witnesses is being used by the prover to carry out the 
proof, even if the verifier knows both witnesses. We focus on NP languages L 
with a corresponding witness relation Rl. The readers are referred to [E! for 
formal definition. 

Special-sound WI proofs for NP languages can be based on the existence 
of non-interactive commitment schemes. Assuming only one-way functions, 4- 
round special-sound WI proofs for NP languages exist 0 More precisely, there is 
a 3-round special-sound WI proof for the language of Hamiltonian Graphs CCZ3, 
assuming one-way permutation families exist. If the commitment scheme used 
by the protocol El is replaced by Naor’s commitment scheme [El, then it 
becomes a 4-round special-sound WI proof while the assumption is reduced to 
the existence of one-way functions. For simplicity, we use 3-round special-sound 
WI proofs in our protocol though our proof works also with 4-round special- 
sound WI proofs. 

2.2 Commitment Schemes 

In this work, we consider statistically hiding commitment schemes. 

Definition 1 (Commitment Scheme). A pair of PPT interactive machines 
( C , R) is said to be a commitment scheme if the following two properties hold: 


1 A 4-round protocol is special-sound if there exists a polynomial-time deterministic
procedure to extract the witness from any two accepting transcripts $(\tau, \alpha, \beta, \gamma)$ and
$(\tau', \alpha', \beta', \gamma')$ such that $\tau = \tau'$, $\alpha = \alpha'$ and $\beta \neq \beta'$.
Statistical hiding: For every unbounded interactive Turing machine $R^*$, it holds
that the ensemble $\{\mathrm{sta}^{R^*}_{\langle C,R\rangle}(v_1, z)\}_{v_1 \in \{0,1\}^n, n \in N, z \in \{0,1\}^*}$ and the ensemble
$\{\mathrm{sta}^{R^*}_{\langle C,R\rangle}(v_2, z)\}_{v_2 \in \{0,1\}^n, n \in N, z \in \{0,1\}^*}$ have negligible statistical difference$^2$,
where $\mathrm{sta}^{R^*}_{\langle C,R\rangle}(v, z)$ denotes the random variable describing the output of $R^*$
after receiving a commitment to v using $\langle C, R\rangle$.

Computational binding: A malicious (expected) PPT committer S* can suc- 
ceed in opening a given commitment in two different ways only with negligible 
probability. The reader is referred to USB for more details. 
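
The statistical difference used in the hiding condition is half the $L_1$ distance between two distributions. A small sketch, with distributions given as dictionaries from outcomes to probabilities (a representation chosen here for illustration):

```python
from fractions import Fraction

def statistical_difference(px, py):
    """Return (1/2) * sum over outcomes of |Pr[X = a] - Pr[Y = a]|."""
    support = set(px) | set(py)
    total = sum(abs(Fraction(px.get(a, 0)) - Fraction(py.get(a, 0)))
                for a in support)
    return total / 2

uniform = {0: Fraction(1, 2), 1: Fraction(1, 2)}
biased = {0: Fraction(3, 4), 1: Fraction(1, 4)}
```

A scheme is statistically hiding when this quantity, taken between the receiver's views for any two committed values, is negligible in n.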

2.3 Non-malleable Commitments 

As stated in H3> we formalize the notion of non-malleability by a comparison
between a man-in-the-middle execution and a simulated execution. Just as Ena. 
we consider a tag-based variant of non-malleability. 

Let $\langle C, R\rangle$ be a commitment scheme. Let $n \in N$ be a security parameter.
Let $\mathcal{R} \subseteq \{0,1\}^n \times \{0,1\}^n$ be a polynomial-time computable valid relation m
(i.e., for all $v \in \{0,1\}^n$, $\mathcal{R}(v, \bot) = 0$). In the man-in-the-middle execution,
the adversary A is simultaneously participating in a left and a right interaction.
In the left interaction, the man-in-the-middle adversary A interacts with the
committer C to receive a commitment to a value v using tag tag. In the right
interaction, A interacts with the receiver R and tries to commit to a related value
$\tilde{v}$ using a tag $\widetilde{tag}$ of its choice. After the commit phase execution in both interactions,
A receives decommitment keys from C and then generates the corresponding
decommitment key for $\tilde{v}$. Prior to the interaction, the value v is given to C as local
input. A receives an auxiliary input z, which might contain a priori information
about v. If the right commitment or decommitment fails, or $tag = \widetilde{tag}$, $\tilde{v}$ is
set to $\bot$. Let the boolean random variable $\mathrm{mim}^{A}_{open}(\mathcal{R}, v, z)$ denote whether A
succeeds. Note $\mathrm{mim}^{A}_{open}(\mathcal{R}, v, z) = 1$ if and only if A decommits to a value $\tilde{v}$ such
that $\mathcal{R}(v, \tilde{v}) = 1$.

In the simulated execution, a simulator S directly interacts with an honest re-
ceiver R. As in the man-in-the-middle execution, the value v is chosen prior
to the interaction, and S receives some a priori information about v as part of
its auxiliary input z. S also receives tag tag. S first executes the commitment
scheme with R. Once the commitment phase has been completed, S receives the
value v and attempts to decommit to a value $\tilde{v}$ with tag $\widetilde{tag}$. If $tag = \widetilde{tag}$, $\tilde{v}$
is set to $\bot$. Let the boolean random variable $\mathrm{sim}^{S}_{open}(\mathcal{R}, v, z)$ denote whether S
succeeds. Note $\mathrm{sim}^{S}_{open}(\mathcal{R}, v, z) = 1$ if and only if S decommits to a value $\tilde{v}$ such
that $\mathcal{R}(v, \tilde{v}) = 1$.

Definition 2 (Non-malleable Commitment mi). A commitment scheme
$\langle C, R\rangle$ is said to be non-malleable with respect to opening if for every PPT
man-in-the-middle adversary A, there exists an expected PPT simulator S and a
negligible function $\mu : N \to [0,1]$, such that for every polynomial-time computable
valid relation $\mathcal{R} \subseteq \{0,1\}^n \times \{0,1\}^n$, for all tags of polynomial length, for every
$v \in \{0,1\}^n$ and every $z \in \{0,1\}^*$, the following holds:

$\Pr[\mathrm{mim}^{A}_{open}(\mathcal{R}, v, z) = 1] < \Pr[\mathrm{sim}^{S}_{open}(\mathcal{R}, v, z) = 1] + \mu(n)$

2 The statistical difference between two ensembles $\{X_i\}_{i \in I}$ and $\{Y_i\}_{i \in I}$ is defined by

$\Delta(i) = \frac{1}{2} \sum_{\alpha} |\Pr[X_i = \alpha] - \Pr[Y_i = \alpha]|$.

A commitment scheme that is non-malleable according to Definition 2 is liberal
non-malleable rather than strict non-malleable EE:- Note we follow 0 in that 
non-malleability is guaranteed only if the commit phase and the reveal phase do 
not overlap. 

3 Construction 

We begin by presenting a high-level overview of our protocol. Our protocol is based 
on the statistically hiding commitment scheme P while relying on the messages 
scheduling technique m3 which is a slight modification of the message schedul- 
ing technique of (5j- The commit phase of our protocol is the same as that of 
the commitment protocol in p. The reveal phase, however, comes in two parts. 
Roughly, the reveal phase employs the two-witness technique by Feige m and the
well known FLS-technique m. First, the receiver proves that it knows one of the
preimages of either element $s_0$ or element $s_1$, computed by itself, in the domain of a
one-way function. Then, the committer sends the committed value v and proves it
knows how to open the commitment or one of the preimages of either element $s_0$
or element $s_1$. The proofs used by the prover and the verifier are all tag-based WI
proofs elaborately scheduled as 0. For simplicity of exposition, our description
relies on the existence of one-way functions with efficiently recognizable range$^3$.
We also assume the one-way function is length-preserving, since any one-way func-
tion can be transformed into a length-preserving one-way function ESI.

3.1 Tag-Based Witness-Indistinguishable Proof 

First, we propose a tag-based WI proof for every NP language L, which is used as
a basic tool in the final commitment scheme. The length of the tag is polynomially
bounded in the security parameter n. Denote the polynomial by
$\ell(\cdot)$. In Fig. 1, both design$_0$ and design$_1$ contain two executions of special-sound
WI proofs for L but with elaborately designed scheduling. The tag-based WI
proof $\langle P_{tag}, V_{tag}\rangle$ for L is shown in Fig. 2. The protocol is composed of $4\ell$
special-sound WI proofs for language L. More precisely, there are $\ell$ rounds, where
in round j, the schedule design$_{tag_j}$ is followed by design$_{1-tag_j}$. The properties of
$\langle P_{tag}, V_{tag}\rangle$ are easy to verify. The details are omitted.
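
The tag-dependent ordering of the two schedules can be sketched as follows: in round j, design number tag_j runs first and design number 1 - tag_j runs second. The function name `schedule` and the list-of-bits tag representation are illustrative assumptions; the internal message order of each design (Fig. 1) is not modeled.

```python
def schedule(tag_bits):
    """Return the sequence of design indices executed for the given tag."""
    order = []
    for bit in tag_bits:
        order.append(bit)       # design_{tag_j}
        order.append(1 - bit)   # followed by design_{1 - tag_j}
    return order
```

Two tags that differ in position j produce opposite design orders in round j, which is what the safe-point argument exploits.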

One basic technique in proving the security of most zero-knowledge and com- 
mitment protocols is standard rewinding. However, the rewinding technique is 
problematic when extending to concurrent (here one-left one-right) execution en- 
vironment as an adversary may adaptively schedule its messages that withstand 
any targeted simulator (i.e., the simulator may run super-polynomial time or is 

3 The protocol can be easily modified to work with arbitrary one-way function by pro- 
viding a witness hiding proof that an element is in the range of the one-way function. 
Fig. 1. Two schedules (diagram not recoverable from the source)


Protocol $\langle P_{tag}, V_{tag}\rangle$

Security Parameter: $1^n$

Common Input: An instance $x \in \{0,1\}^n$
Tag string: $tag \in \{0,1\}^{\ell(n)}$

For j = 1 to $\ell(n)$

P ↔ V: Execute design$_{tag_j}$
Execute design$_{1-tag_j}$

Fig. 2. Tag-based WI proof $\langle P_{tag}, V_{tag}\rangle$


exposed to malleability attack). Considering the non-malleability property for
commitment schemes, the pivot is to design a stand-alone simulator that sat-
isfies Definition 2. Here we also come up against the problem of how to simulate
when the adversary adaptively schedules its messages.

The scheduling in Fig. 0 which is identical to [E! is vital in achieving the non- 
malleability. The main advantage of this scheduling is that for the proof given 
by a man-in-the-middle adversary, there exists a point at which the adversary 
cannot answer the challenge from the verifier by simply modifying the proof on 
the other side (provided the tag of the proof is different from that of the proof 
on the other side.). 

Related to the above scheduling is a notion called safe-point, from which it is 
possible to perform extraction by standard rewinding until we obtain a second 
proof transcript, without “affecting” the other side interaction. Below is the 
formal definition of safe-point, which is mainly taken from ESI and abridged to 
our setting. 

Definition 3 (Safe-point [15]). A prefix $\rho$ of a transcript $\Delta$ is called a safe-point,
if there exists an accepting proof $(\alpha_r, \beta_r, \gamma_r)$ in the right interaction, such that

1. $\alpha_r$ occurs in $\rho$, but not $\beta_r$ (and $\gamma_r$).

2. For any proof $(\alpha_l, \beta_l, \gamma_l)$ in the left interaction, if only $\alpha_l$ occurs in $\rho$, then
$\beta_l$ occurs after $\gamma_r$.
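
The two conditions of the safe-point definition can be checked mechanically on a recorded transcript. The sketch below makes several simplifying assumptions: a transcript is a list of hypothetical events (side, proof_id, message) with side "L" or "R" and message "alpha", "beta" or "gamma"; a prefix is given by a cut index; and a right proof counts as accepting when all three of its messages appear.

```python
def is_safe_point(transcript, cut):
    """Check whether transcript[:cut] is a safe-point in the sense sketched above."""
    pos = {event: i for i, event in enumerate(transcript)}
    never = float("inf")   # position of a message that never occurs
    right_ids = {pid for side, pid, _ in transcript if side == "R"}
    left_ids = {pid for side, pid, _ in transcript if side == "L"}
    for r in right_ids:
        a = pos.get(("R", r, "alpha"), never)
        b = pos.get(("R", r, "beta"), never)
        g = pos.get(("R", r, "gamma"), never)
        # Condition 1: alpha lies in the prefix, beta (hence gamma) does not.
        if g == never or not (a < cut <= b):
            continue
        # Condition 2: any left proof with only alpha in the prefix
        # must receive its challenge after gamma of the right proof.
        ok = True
        for l in left_ids:
            la = pos.get(("L", l, "alpha"), never)
            lb = pos.get(("L", l, "beta"), never)
            if la < cut <= lb and lb < g:
                ok = False
        if ok:
            return True
    return False

safe = [("R", 1, "alpha"), ("L", 1, "alpha"), ("R", 1, "beta"),
        ("R", 1, "gamma"), ("L", 1, "beta"), ("L", 1, "gamma")]
unsafe = [("R", 1, "alpha"), ("L", 1, "alpha"), ("L", 1, "beta"),
          ("R", 1, "beta"), ("R", 1, "gamma"), ("L", 1, "gamma")]
```

In the first transcript the right proof completes before the left challenge arrives, so rewinding to the cut does not disturb the left interaction; in the second it would.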

When protocol $\langle P_{tag}, V_{tag}\rangle$ is run concurrently, the next lemma guarantees that there is
a safe-point for a right interaction that has a tag different from that of the left
interaction.

Lemma 1 (Safe-point Lemma [15]$^4$). In any one-one man-in-the-middle ex-
ecution of $\langle P_{tag}, V_{tag}\rangle$, if the right interaction has a different tag from the tag of
the left interaction, there exists a safe-point for the right interaction.

4 The safe-point lemma in [15] applies to any one-many concurrent execution environ-
ment, where the adversary participates in one left interaction and polynomially many
right interactions. Here we use a simpler version of the safe-point lemma, where the
adversary participates in one left interaction and one right interaction.

3.2 Non-malleable Statistically Hiding Commitment Scheme

Let $\langle SHC, SHR\rangle$ be the statistically hiding commitment scheme$^5$ from any one-
way function 0 and let $\langle P_{tag}, V_{tag}\rangle$ be a tag-based WI proof for NP. The commit-
ment protocol is shown in Fig. 3. The length of the tag is m(n). Our construction
in fact compiles any statistically hiding commitment scheme with non-interactive 
reveal phase into a non-malleable statistically hiding one with interactive reveal 
phase, assuming the existence of one-way functions. 


Protocol $\langle C, R\rangle$

Security Parameter: $1^n$
Tag string: $tag \in \{0,1\}^{m(n)}$
String to be committed: $v \in \{0,1\}^n$

Commit Phase:

C ↔ R: Run the commit phase of commitment scheme $\langle SHC, SHR\rangle$, where C
runs SHC and R runs SHR.

R: Abort if the above commit phase fails.

Let com be the transcript of messages obtained. C records the decommitment
key dec.

Reveal Phase:

Stage 1:

R → C: Pick uniformly $r_0, r_1 \in \{0,1\}^n$, compute $s_0 = f(r_0)$ and $s_1 = f(r_1)$
and send $s_0, s_1$.

R ↔ C: R and C engage in an execution of $\langle P_{tag}, V_{tag}\rangle$ with tag tag, where
R uses $r_b$ as witness ($b \in \{0,1\}$) and runs $P_{tag}$ to prove to C (running $V_{tag}$)
knowledge of a value r s.t. $s_0 = f(r)$ or $s_1 = f(r)$. The challenge length of the
verifier (i.e., C) is 2n.

C: Abort if either $s_0$ or $s_1$ is not in the range of f or the proof fails.

Stage 2: C → R: Send v.

Stage 3:

C ↔ R: C and R engage in an execution of $\langle P_{tag}, V_{tag}\rangle$ with tag tag, where C
runs $P_{tag}$ to prove to R (running $V_{tag}$) that there exists a value dec s.t. dec is
the valid decommitment key of com corresponding to v or there exists a value
r s.t. $s_0 = f(r)$ or $s_1 = f(r)$. The challenge length of the verifier (i.e., R) is 2n.


Fig. 3. Non-malleable statistically hiding commitment scheme (C, R) 


Theorem 2. Suppose that $\langle SHC, SHR\rangle$ is a statistically hiding commitment
scheme with non-interactive reveal phase and $\langle P_{tag}, V_{tag}\rangle$ is a tag-based WI proof.
Then $\langle C, R\rangle$ is a non-malleable statistically hiding commitment scheme.

Remark 1. The commitment scheme shown in Fig. 3 is tag-based non-malleable.
Compared with existing tag-based commitment schemes [211 hl22| , it seems a bit 


5 Note the commitment scheme P is only for a single bit. By running their scheme in 
parallel, we obtain a commitment scheme of any polynomial length. Hence, we also 
assume that the basic statistically hiding commitment scheme is for a string. 
strange that our construction uses tags only in the reveal phase. In fact, this
approach is inspired by the work of j 1 411 . Although tag-based non-malleable com-
mitments can be transformed into content-based non-malleable commitments in
a standard way j2], we explicitly present one in Appendix El for reference.

Remark 2. The high level approach of our commitment scheme is to combine H3 
with jaiSj . That is, to commit to v, in the commit phase, a sender commits v 
using the statistically hiding commitment scheme [I], and in the reveal phase, 
a sender sends v and proves using a “simulation-extractable” argument j2H5j 
that the commit phase transcript opens to v. The simulation strategy at a high 
level is from M ■ For technical reasons, naively using the simulation-extractable 
arguments from ECnidoes not work. We need to modify the opening process by 
adding a “trapdoor” that can be extracted and used by the simulator to cheat 
in the reveal phase. This is the reason why we add one more phase (i.e., Stage 
1). Whereas in jlf I 1 oj . the trapdoor is only used in the hybrid experiment for 
analysis and may therefore be hard-wired via a different analysis.

Proof (sketch). We need to prove the scheme satisfies the following three proper- 
ties: statistical hiding, computational binding and non-malleability with respect 
to opening. We start by proving the hiding and non-malleability properties and 
then return to the proof of the binding property. 

Statistical hiding. The hiding property follows directly from the hiding property 
of the commitment scheme (SHC,SHR). Note that (SHC,SHR) is statistically 
hiding, and so (C, R) is also statistically hiding. 

Non-malleability. We show that for every PPT man-in-the-middle adversary
A, there exists a probabilistic expected polynomial-time simulator S and a
negligible function $\mu$ such that for every polynomial-time computable relation
$\mathcal{R} \subseteq \{0,1\}^n \times \{0,1\}^n$, for every tag tag of length m(n), for every $v \in \{0,1\}^n$
and every $z \in \{0,1\}^*$, it holds that

$\Pr[\mathrm{mim}^{A}_{open}(\mathcal{R}, v, z) = 1] < \Pr[\mathrm{sim}^{S}_{open}(\mathcal{R}, v, z) = 1] + \mu(n)$    (1)

Denote by $A_{rev}$ the state of A after the commit phase, i.e., $A_{rev}$ contains A’s
description along with its configuration at the time just before the reveal phase
starts.

We proceed to describe the simulator S. S, on input z and security parameter
$1^n$, interacts with an honest receiver R and runs the adversary A internally. Dur-
ing the commit phase, on a high level, S internally incorporates A and emulates
the commit phase of the left execution for adversary A by honestly commit-
ting to $0^n$, while externally relaying messages in the right execution between A
and R.

Once the commit phase is finished, S receives a value v and has to perform
the reveal phase internally with $A_{rev}$. In Stage 1, S plays as an honest sender in
the left reveal phase and as an honest receiver in the right reveal phase. Once
the simulation of Stage 1 completes, S applies the safe-point lemma to find a
safe-point and extract a witness w to the statement proved by $A_{rev}$ in the left
reveal phase by standard rewinding$^6$. In Stage 2, S just sends v to $A_{rev}$ in the
left reveal phase. Then the simulation for Stage 3 begins. S uses a fake witness
(i.e., the trapdoor w) to simulate the left interaction for $A_{rev}$, while emulating
the right interaction as an honest receiver. When the simulation for Stage 3
completes, S again applies the safe-point lemma to find a safe-point and ex-
tract a witness $\tilde{w}$ (i.e., the decommitment keys of A) in the right interaction.
Finally, by using $\tilde{w}$, S can complete the reveal phase of the external execution
with R.

More formally, S proceeds as follows on auxiliary input z and tag tag: 

1. S internally incorporates A(z). 

2. During the commit phase S proceeds as follows: 

(a) S internally emulates the left interaction for A by honestly committing to
$0^n$.

(b) Messages from right execution are forwarded externally to R. 

3. Once the commit phase has finished, S receives the value v. Let com, $\widetilde{com}$
denote the left and right execution transcripts respectively.

4. During the reveal phase S internally incorporates $A_{rev}$ and proceeds as fol-
lows:

(a) Stage 1 Main Execution Phase: S emulates a one-one man-in-the-
middle execution by playing as honest sender with tag tag on the left
and as honest receiver on the right. After completing the execution,
denote by $\Delta$ the transcript of messages obtained. Denote the right
tag by $\widetilde{tag}$. We emphasize here that S can emulate the left interaction
independently of v in Stage 1.

Stage 1 Rewinding Phase: Next, S attempts to extract the witness
used by $A_{rev}$ on the left if $tag \neq \widetilde{tag}$.

i. In $\Delta$, find the first point $\rho$ that is a safe-point. Let the associated
proof be $(\alpha_\rho, \beta_\rho, \gamma_\rho)$.

ii. Repeat until a second proof transcript $(\alpha_\rho, \beta'_\rho, \gamma'_\rho)$ is obtained:
Emulate the left interaction as in the Stage 1 Main Execution
Phase. For the right interaction:

- If $A_{rev}$ expects to get a new proof from the right receiver, S
then emulates the proof by generating design$_0$ himself. Forward
one of the two proofs internally.

- If $A_{rev}$ sends a challenge for a proof whose first message occurs
in $\rho$, cancel the execution, rewind to $\rho$ and continue.

iii. If $\beta_\rho \neq \beta'_\rho$, extract and record the witness w from $(\alpha_\rho, \beta_\rho, \gamma_\rho)$
and $(\alpha_\rho, \beta'_\rho, \gamma'_\rho)$. Otherwise halt and output fail.

Finally, if the above (i.e., step 4(a)) runs for more than $2^n$ steps, halt and
output fail.

6 In Stage 1, the committer acts as a prover and the receiver acts as a verifier. The
safe-point and safe-point lemma still work by interchanging right and left.
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(b) Stage 2: Send v to the adversary A rev . 

(c) Stage 3 Main Execution Phase: By using w as witness, S can easily 

simulate left interaction for A rev . The right interaction is emulated by 
S adopting honest receiver strategy. After completing the execution, 
denote by A' the transcripts of messages obtained in the execution 
of Stage 2 and Stage 3 . 

Stage 3 Rewinding Phase: S attempts to extract the decommitment
key of $A_{rev}$ on the right:

i. In $\Delta'$, find the first point $\rho$ that is a safe-point. Let the associated
proof be $(\hat{\alpha}_\rho, \hat{\beta}_\rho, \hat{\gamma}_\rho)$.

ii. Repeat until a second proof transcript $(\hat{\alpha}_\rho, \hat{\beta}'_\rho, \hat{\gamma}'_\rho)$ is obtained:
Emulate the right interaction as in the Stage 3 Main Execution
Phase. For the left interaction:

- If $A_{rev}$ expects to get a new proof from the committer, S is free
to answer the request by using the witness w, except that when $A_{rev}$
sends a challenge for a proof whose first message occurs in $\rho$,
S cancels the execution, rewinds to $\rho$ and continues.

iii. If $\hat{\beta}_\rho \neq \hat{\beta}'_\rho$, extract a witness w from $(\hat{\alpha}_\rho, \hat{\beta}_\rho, \hat{\gamma}_\rho)$ and $(\hat{\alpha}_\rho, \hat{\beta}'_\rho, \hat{\gamma}'_\rho)$.
Otherwise halt and output fail.

iv. If w is a valid decommitment key for (SHC, SHR), i.e., (com, w, v)
is a legal transcript for (SHC, SHR), set rev = w. Otherwise halt
and output fail.

Finally, if the above step runs for more than $2^n$ steps, halt and
output fail.

(d) If the right interaction is accepting with a tag $\widetilde{tag} \neq tag$, and rev contains a
valid decommitment key, run the honest committer strategy on input
com, decommitment key rev, and value v, with tag $\widetilde{tag}$.
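
The extraction in step iii of each rewinding phase relies on special soundness: two accepting transcripts with the same first message and different challenges yield a witness. The sketch below illustrates that idea with a toy Schnorr-style protocol, which is an assumption made purely for illustration; the proofs $(P_{tag}, V_{tag})$ in the text are special-sound WI proofs for NP statements, not Schnorr proofs, and the parameters and helper names here are hypothetical.

```python
# Toy Schnorr-style sigma protocol over the order-11 subgroup of Z_23^*.
# All parameters are illustrative assumptions, far too small to be secure.
P, Q, G = 23, 11, 2  # modulus, subgroup order, generator (2^11 = 1 mod 23)


def respond(r: int, x: int, beta: int) -> int:
    """Prover's third message for coins r, witness x, and challenge beta."""
    return (r + beta * x) % Q


def accepts(h: int, alpha: int, beta: int, gamma: int) -> bool:
    """Verifier's check: g^gamma == alpha * h^beta (mod P)."""
    return pow(G, gamma, P) == (alpha * pow(h, beta, P)) % P


def extract(t1, t2):
    """Recover the witness from two accepting transcripts that share alpha.

    Returns None (the analogue of outputting fail) when the challenges coincide.
    """
    (alpha1, beta1, gamma1), (alpha2, beta2, gamma2) = t1, t2
    if alpha1 != alpha2 or (beta1 - beta2) % Q == 0:
        return None
    inv = pow((beta1 - beta2) % Q, -1, Q)  # modular inverse, Python 3.8+
    return ((gamma1 - gamma2) * inv) % Q


x, r = 7, 5                               # hypothetical witness and prover coins
h, alpha = pow(G, x, P), pow(G, r, P)
first = (alpha, 3, respond(r, x, 3))      # transcript from the main execution
second = (alpha, 8, respond(r, x, 8))     # transcript obtained after rewinding
recovered = extract(first, second)
```

Here `recovered` equals the witness `x`, while `extract(first, first)` returns `None`, mirroring the "otherwise halt and output fail" branch when the same challenge is drawn twice.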

Running time of S. We show that the running time of S is expected PPT.
Note that the time spent by S in the commit phase is poly(n). After S extracts the
witness w, the time S spends in the subsequent step that uses w is also poly(n). Next, we show that the
expected time spent by S in the rest of the reveal phase is
also poly(n). For simplicity, we assume that S does not check the fail condition
and may run for more than $2^n$ steps (since this only increases the total running
time).

Recall that in the reveal phase, S rewinds A from two safe-points. We need
to show that the time spent in the Stage 1 Rewinding Phase and in the Stage 3 Rewinding Phase is expected PPT. We first
analyze the time spent in the Stage 1 Rewinding Phase during the simulation. Then, using the same
method, we show that the time spent in the Stage 3 Rewinding Phase is also expected PPT.

Note the time spent by S in the Stage 1 Main Execution Phase is poly(n). We 
then show the time spent in Stage 1 Rewinding Phase is expected PPT. The anal- 
ysis hereafter is similar to that in na but is simpler. Let T(i) be the random vari- 
able that describes the time spent in rewinding a proof after i messages have been 
exchanged. We show that $E[T(i)] \le poly(n)$ and then, by linearity of expectation,
we conclude that the expected time spent by S in the Stage 1 Rewinding Phase
is $\sum_i E[T(i)] \le \sum_i poly(n) \le poly(n)$.

Next we bound $E[T(i)]$. Given a partial transcript of messages
$\rho$, let $\Pr[\rho]$ denote the probability that $\rho$ occurs as a prefix of the execution
emulated in the Stage 1 Main Execution Phase. Let $p_\rho$ denote the probability that $\rho$
is a safe-point$^7$ and is rewound. From the construction of S, we know that S
keeps rewinding until it finds another accepting transcript $(\alpha_\rho, \beta'_\rho, \gamma'_\rho)$ for $\rho$,
canceling each rewinding for which $\rho$ is not a safe-point, i.e., $A_{rev}$ requests the
second message of a proof in the right interaction whose first message occurs in $\rho$.
As the emulated committer and receiver act identically to the real committer and real
receiver in this stage, conditioned on $\rho$, a view occurring in a rewinding from $\rho$ is
distributed as one occurring in the Stage 1 Main Execution Phase. Thus, the probability of
canceling a rewinding from $\rho$ is at most $1 - p_\rho$. Furthermore, the expected number
of rewindings is at most $1/p_\rho$. Therefore, the expected number of rewindings from
$\rho$ is at most $p_\rho \cdot \frac{1}{p_\rho} = 1$ and each rewinding takes at most poly(n) steps, i.e.,

$E[T(i) \mid \rho] \le poly(n)$. Thus,

$$E[T(i)] = \sum_{\rho \text{ of length } i} E[T(i) \mid \rho] \cdot \Pr[\rho] \le poly(n) \cdot \sum_{\rho \text{ of length } i} \Pr[\rho] \le poly(n).$$

The expected running time of S in the Stage 3 Rewinding Phase is also polynomial, by an
analysis similar to the above. We omit the details.
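
The counting step in the argument above (a prefix is rewound with probability $p_\rho$, and then needs about $1/p_\rho$ attempts) can be sanity-checked numerically. The sketch below is a simplified model, not the simulator itself: it assumes each rewinding succeeds independently with probability exactly $p_\rho$, whereas the text only needs this as a lower bound. The function names are hypothetical.

```python
import random


def rewinds_from_prefix(p_rho: float, rng: random.Random) -> int:
    """Number of rewindings charged to one prefix rho in the simplified model.

    With probability p_rho the prefix is a safe-point that gets rewound; each
    rewinding then succeeds independently with probability p_rho.
    """
    if rng.random() >= p_rho:
        return 0                      # rho is never rewound
    attempts = 1
    while rng.random() >= p_rho:      # cancelled or non-accepting rewinding
        attempts += 1
    return attempts


def average_rewinds(p_rho: float, trials: int, seed: int = 0) -> float:
    """Empirical mean of the rewinding count over independent trials."""
    rng = random.Random(seed)
    return sum(rewinds_from_prefix(p_rho, rng) for _ in range(trials)) / trials
```

For any fixed $p_\rho$ in (0, 1], the average should come out close to 1, matching $p_\rho \cdot (1/p_\rho) = 1$ regardless of how small $p_\rho$ is.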

Analysis of the simulator S. In order to show the required equation, we define a
hybrid stand-alone simulator $HYB_1$ that also receives v as auxiliary input. $HYB_1$
proceeds exactly as S except that in the commit phase, instead of feeding A a
commitment to $0^n$, $HYB_1$ feeds A a commitment to v.

Since both the experiment S and $HYB_1$ are efficiently computable, the following
claim follows directly from the hiding property of (SHC, SHR).

Claim 1. There exists some negligible function $\mu'$ such that

$$\left| \Pr[\mathrm{sim}^{S}_{open}(R, v, z) = 1] - \Pr[\mathrm{sim}^{HYB_1}_{open}(R, v, z) = 1] \right| \le \mu'(n)$$

Next we proceed to show the following claim.

Claim 2. There exists some negligible function $\mu''$ such that

$$\left| \Pr[\mathrm{mim}^{A}_{open}(R, v, z) = 1] - \Pr[\mathrm{sim}^{HYB_1}_{open}(R, v, z) = 1 \mid \neg fail] \right| \le \mu''(n)$$

Proof (sketch). Note that the view of A in the commit phase in a real interaction is
identical to the view of A in $HYB_1$. Furthermore, since $HYB_1$ feeds A messages
according to the correct distribution in Stage 1, the view of $A_{rev}$ in the simulation
of Stage 1 by experiment $HYB_1$ is identical to the view of $A_{rev}$ in a real
interaction. The view of $A_{rev}$ in the simulation of Stage 3 by $HYB_1$ is computationally
indistinguishable from its view in a real interaction, by the witness-indistinguishability of $(P_{tag}, V_{tag})$.
As the safe-point lemma shows, when the right interaction has a different tag
from the left interaction, there is a safe-point. Hence, according to the actions
of $HYB_1$, it will either output fail or succeed in the extraction from $A_{rev}$.
Conditioned on $HYB_1$ not outputting fail, by the computational-binding property of
(SHC, SHR), except with negligible probability, the witness w and the value v
extracted by $HYB_1$ are the valid decommitment key and committed value of A,
respectively.

7 Note that the roles of C and R interchange in Stage 1, where C acts as a verifier and R
acts as a prover. The safe-point lemma is used by interchanging the right and
the left.

We next show that $\left| \Pr[\mathrm{sim}^{HYB_1}_{open}(R, v, z) = 1] - \Pr[\mathrm{sim}^{HYB_1}_{open}(R, v, z) = 1 \mid \neg fail] \right|$ is
negligible by proving that the probability that the event fail happens is negligible. This
together with Claim 1 and Claim 2 establishes the required equation.

Claim 3. $HYB_1$ outputs fail with negligible probability.

Proof. The proof of this claim is similar to that of El. More precisely, $HYB_1$
outputs fail only in three cases: $HYB_1$ runs for more than $2^n$ steps; or the same
proof transcript is obtained from some safe-point; or the witness extracted is
not a valid decommitment. The arguments for the first two cases are almost
the same as those in B3. The main difference lies in the analysis of the third
case.

$HYB_1$ runs for more than $2^n$ steps: We know that the expected running times
of $HYB_1$ and S are the same, i.e., poly(n). Using Markov's inequality, we
conclude that the probability that $HYB_1$ runs for more than $2^n$ steps is at most
$poly(n) / 2^n$.

The same proof transcript is obtained from some safe-point: This case
occurs if $HYB_1$ picks some challenge $\beta$ (resp. $\hat{\beta}$) in the Stage 1 (resp. Stage 3)
Rewinding Phase that appeared as a challenge in the Stage 1 (resp. Stage
3) Main Execution Phase. As $HYB_1$ runs for at most $2^n$ steps, it picks
at most $2^n$ challenges. Furthermore, the length of each challenge is 2n.
By applying the union bound, we obtain that the probability that a $\beta$
(resp. $\hat{\beta}$) is picked twice is at most $2^n / 2^{2n} = 2^{-n}$. Since there are at most
polynomially many challenges in Stage 1 (resp. Stage 3), using the union bound again,
we conclude that the probability that it outputs fail in this case is
negligible.
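
The first two failure cases reduce to simple arithmetic, which can be made concrete with exact fractions. The helper names below are hypothetical, and the concrete polynomial used in the usage note is only an example; the sketch simply evaluates Markov's inequality and the two union bounds described above.

```python
from fractions import Fraction


def markov_bound(expected_steps: int, n: int) -> Fraction:
    """Markov: Pr[T > 2^n] <= E[T] / 2^n for a non-negative running time T."""
    return min(Fraction(1), Fraction(expected_steps, 2 ** n))


def repeat_bound(n: int) -> Fraction:
    """Union bound over at most 2^n sampled challenges, each 2n bits long."""
    return Fraction(2 ** n, 2 ** (2 * n))


def total_repeat_bound(n: int, num_challenges: int) -> Fraction:
    """Second union bound, over the challenges of the main execution."""
    return min(Fraction(1), num_challenges * repeat_bound(n))
```

For instance, with n = 40 and an expected running time of $n^3$ steps, `markov_bound(40**3, 40)` is $64000 / 2^{40}$, and `repeat_bound(40)` is $2^{-40}$; both shrink exponentially in n while the number of challenges grows only polynomially.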

The witness extracted is not a valid decommitment:$^8$ Suppose, on the
contrary, that the witness extracted is not the decommitment key for (SHC, SHR).
Then by the special-soundness property, it must be a value $r'$
such that $f(r') = s_{b'}$ for some $b' \in \{0, 1\}$. Denote by $r_b$ ($b \in \{0, 1\}$) the
witness used by $HYB_1$ in Stage 1 of the right interaction. If $b' = 1 - b$, then
we can break the one-way function f. Given A, z and v, we construct an
algorithm B that inverts f. The input to B is an n-bit string $y = f(x)$
where x was chosen randomly from $\{0, 1\}^n$. B wants to output a pre-image
of y under f. B proceeds as follows: B runs identically to $HYB_1$ with
inputs z, v, with the exception that when simulating the right receiver for A
in Stage 1 of the reveal phase, it picks a random bit $b \in \{0, 1\}$ and a
random string $r_b \in \{0, 1\}^n$, and sets $s_b = f(r_b)$, $s_{1-b} = y$. By using $r_b$ as
witness, it can simulate the right interaction with $A_{rev}$ easily. Finally, if
B extracts a witness $r'$ where $f(r') = y$, then we break the one-wayness
of f. The probability that B inverts f is identical to the probability that
$HYB_1$ inverts f, which is non-negligible. This contradicts the one-wayness
of f.

8 The proof in this case heavily relies on the “simulation-extractability” property of
$(P_{tag}, V_{tag})$ in Stage 1. An ordinary WI proof of knowledge does not suffice here, as
the problem in this case is reduced to the security of one-way functions or the
witness-indistinguishability of the underlying subprotocols, in the presence of an expected PPT
adversary who can rewind the same subprotocols.

We therefore only have to deal with the case that B always outputs $r'$
such that $f(r') = s_b$, i.e., B always outputs the same preimage it knows. Then
we can break the witness indistinguishability of the underlying special-sound
proofs as follows: Recall that the proof $(P_{tag}, V_{tag})$ in Stage 1 of the right
interaction contains 4m special-sound WI proofs. The above assumption
is that B always extracts the same preimage used by itself in Stage 1 of the right
interaction. We know that if the 4m proofs use $r_0$, B outputs $r_0$,
and if the 4m proofs use $r_1$, B outputs $r_1$. Applying a standard hybrid
argument, there exists $i \in [4m]$ such that, when using $r_0$ for the first $i - 1$ proofs and $r_1$
for the last $4m - i$ proofs, the witness used in the i-th special-sound proof is
the same as the witness extracted by B. We can use this session to
break the witness-indistinguishability of the special-sound WI proof. The
probability that we break the witness-indistinguishability property of the underlying
special-sound proof is $\frac{1}{4m}$ times the probability that $HYB_1$ inverts f, which
is non-negligible. This contradicts the witness-indistinguishability property
of the underlying special-sound proof.

Computational binding. The binding property intuitively follows from the
binding property of the underlying commitment scheme (SHC, SHR) and the
special-soundness property (or more precisely, the proof of knowledge property) of the underlying
proof in $(P_{tag}, V_{tag})$. A formal proof proceeds along the lines of the proof of
non-malleability. More precisely, suppose there exists an adversary A that can violate
the binding property of (C, R); then we design an algorithm A' that violates the
binding property of (SHC, SHR). A' incorporates A and relays the commit phase
messages to an external honest receiver SHR. In the reveal phase, there is no
need for A' to simulate the left interaction for A. Note that in the non-malleability
proof, two extractions are executed. Here, we execute only one extraction by
standard rewinding, and obtain the decommitment key. Using this information,
A' can easily complete the reveal phase with SHR. It follows from the
witness-indistinguishability property of $(P_{tag}, V_{tag})$ that the probability that A' breaks
the binding property of (SHC, SHR) is negligibly close to the probability that A
breaks the binding property of (C, R).

Schedule of messages: In the non-malleability proof, the design of S is based on 
an unspecified assumption, i.e., in the reveal phase, Stage 3 on both interactions 
will not start unless the simulations for Stage 1 are completed. Without loss of 
generality, this assumption is reasonable. 

Consider the scenario where the simulation for Stage 1 of the left interaction
and Stage 3 of the right interaction overlap. The simulation goes well, as the
adversary runs as a prover in Stage 3 of the right interaction, and the rewinding
of Stage 1 of the left interaction will not “rewind” Stage 3 of the right
interaction (i.e., the adversary can only answer the left challenge by itself, without
help from the right interaction). By using the safe-point lemma, the
simulator can still find a safe-point and extract the witness to the statement proved
by the adversary by standard rewinding. Furthermore, the adversary also runs
as a prover in Stage 1 of the left interaction, and the rewinding of Stage 3 of
the right interaction will not “rewind” Stage 1 of the left interaction. For
a similar but simpler reason, when the simulation for Stage 3 of the
left interaction and Stage 1 of the right interaction overlap, the simulator has
no difficulty and the two extractions also perform well. We take special note
of the fact that the safe-point lemma establishes the existence of a safe-point in any
one-one concurrent execution environment, and covers an environment where
one side of the interaction is empty as a special case.

A A Content-Based Non-malleable Commitment Scheme

Let (SHC, SHR) be the statistically hiding commitment scheme P from any
one-way function and let $(P_{tag}, V_{tag})$ be a tag-based WI proof for all NP. Let SS =
(SG, Sig, SVer) be a secure signature scheme. The content-based non-malleable
statistically hiding commitment scheme is shown in Fig. 4. Due to the page limit,
the formal proof is omitted here.

Protocol (C, R)

Security Parameter: $1^n$

String to be committed: $v \in \{0, 1\}^n$

Commit Phase:

C ↔ R: Run the commit phase of commitment scheme (SHC, SHR).

R: Abort if the above commit phase fails.

Denote the above transcript as com. C records the decommitment key in dec.

Reveal Phase:

Stage 1:

R → C: Set $(pk_0, sk_0) \leftarrow SG(1^n)$ and send $pk_0$.

R → C: Pick uniformly $r_0, r_1 \in \{0, 1\}^n$, compute $s_0 = f(r_0)$ and $s_1 = f(r_1)$,
and send $s_0, s_1$.

R ↔ C: R and C engage in an execution of $(P_{pk_0}, V_{pk_0})$ with tag $pk_0$, where
R uses $r_b$ as witness ($b \in \{0, 1\}$) and runs $P_{pk_0}$ to prove to C (running $V_{pk_0}$)
that there exists a value r s.t. $s_0 = f(r)$ or $s_1 = f(r)$. The challenge length of
the verifier (i.e., C) is 2n. C aborts if either $s_0$ or $s_1$ is not in the range of f
or the proof fails.

R → C: Let $tr_0$ be the transcript so far. Set $\sigma_0 \leftarrow Sig(tr_0, sk_0)$ and send $\sigma_0$.
C: Abort if $SVer(pk_0, tr_0, \sigma_0) \neq 1$.

Stage 2: C → R: Send v.

Stage 3:

C → R: Set $(pk_1, sk_1) \leftarrow SG(1^n)$ and send $pk_1$.

C ↔ R: C and R engage in an execution of $(P_{pk_1}, V_{pk_1})$ with tag $pk_1$, where
C uses witness dec and runs $P_{pk_1}$ to prove to R (running $V_{pk_1}$) that there
exists a value dec s.t. dec is the decommitment key of com corresponding to v
or there exists a value r s.t. $s_0 = f(r)$ or $s_1 = f(r)$. The challenge length of
the verifier (i.e., R) is 2n.

C → R: Let $tr_1$ be the transcript so far. Set $\sigma_1 \leftarrow Sig(tr_1, sk_1)$ and send $\sigma_1$.
R: Abort if $SVer(pk_1, tr_1, \sigma_1) \neq 1$.


Fig. 4. Non-malleable statistically hiding commitment scheme (C, R)

Abstract. Proofs of storage (PoS) are interactive protocols allowing
a client to verify that a server faithfully stores a file. Previous work 
has shown that proofs of storage can be constructed from any homo- 
morphic linear authenticator (HLA). The latter, roughly speaking, are 
signature/message authentication schemes where ‘tags’ on multiple mes- 
sages can be homomorphically combined to yield a ‘tag’ on any linear 
combination of these messages. 

We provide a framework for building public-key HLAs from any iden- 
tification protocol satisfying certain homomorphic properties. We then 
show how to turn any public-key HLA into a publicly-verifiable PoS with
communication complexity independent of the file length and supporting 
an unbounded number of verifications. We illustrate the use of our trans- 
formations by applying them to a variant of an identification protocol by 
Shoup, thus obtaining the first unbounded-use PoS based on factoring 
(in the random oracle model). 


1 Introduction 

Advances in networking technology and the rapid accumulation of information 
have fueled a trend toward outsourcing data management to external service 
providers (“servers”). By doing so, organizations can concentrate on their core 
tasks rather than incurring the substantial hardware, software and personnel 
costs involved in maintaining data “in house” . 

Outsourcing storage prompts a number of interesting challenges. One prob- 
lem is to verify that the server continually and faithfully stores the entire file f 
entrusted to it by the client. The server is untrusted in terms of both secu- 
rity and reliability: it might maliciously or accidentally erase the data or place 
it onto temporarily unavailable storage media. This could occur for numerous 
reasons including cost-savings or external pressures (e.g., government censure). 

* Portions of this work done while at Johns Hopkins. 

** Portions of this work done while at IBM. Research supported by NSF grant 
#0426683.

The server might also accidentally erase some data and choose not to notify the
client. Exacerbating the problem (and precluding naive approaches) are factors 
such as limited bandwidth between the client and server, as well as the client’s 
limited resources. See nnn for a more thorough discussion. 

If we allow communication complexity linear in f, there is a simple mechanism
allowing the client to verify that the server stores f at any given time: When
the client uploads f, the client locally stores a hash of f; to verify, the server
simply sends all of f and the client checks that this hashes to the correct value.
For our purposes, we are interested in solutions with communication complexity 
that is much smaller than (and, ideally, independent of) the file size. 
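
The linear-communication baseline just described is easy to make concrete. The sketch below assumes SHA-256 as the hash function (the text does not fix one), and the function names are hypothetical.

```python
import hashlib


def client_upload(file_bytes: bytes) -> bytes:
    """Client side: keep only a short digest of the outsourced file."""
    return hashlib.sha256(file_bytes).digest()


def server_respond(stored_bytes: bytes) -> bytes:
    """Server side: the 'proof' is the entire file, hence linear communication."""
    return stored_bytes


def client_verify(digest: bytes, response: bytes) -> bool:
    """Client side: accept iff the returned file hashes to the stored digest."""
    return hashlib.sha256(response).digest() == digest
```

The client's state is a single 32-byte digest, but every audit transfers the whole file, which is exactly the cost the rest of the paper aims to avoid.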

Ateniese et al. P and Juels and Kaliski m independently introduced ap- 
proaches to this problem having sub-linear communication complexity. (Earlier 
work by Naor and Rothblum na is related, but considers a somewhat weaker 
adversarial model.) Ateniese et al. also distinguish between the case of private 
verifiability, where only the original client (or anyone with whom that client 
shares a key) can verify the server’s storage, and public verifiability, where any- 
one knowing the client’s public key can perform verification. Extensions and 
improvements were given by Shacham and Waters m, Dodis, Vadhan, and 
Wichs jSj, and Bowers, Juels, and Oprea |0j. We refer to jS] for a more detailed 
comparison among the existing schemes. 

Here, we are interested in publicly-verifiable schemes that can be used for an 
unbounded number of verifications. A useful tool for this, implicit in P and 
further studied in H3E1, is a homomorphic linear authenticator (HLA), which 
can be defined in either the private- or public-key setting. Roughly speaking,
this primitive allows a client to ‘tag’ each block $f_i$ of a file $f = f_1 | \cdots | f_n$ in such
a way that for any vector c the server can homomorphically construct a (short)
tag authenticating the value $\sum_i c_i \cdot f_i$.

Two recent works have considered the dynamic setting, where the remotely- 
stored data can be updated 00 - We do not address this problem here. 

1.1 Our Contributions 

The main contribution of this paper is to show a general mechanism (in the
random oracle model) for constructing public-key HLAs from any identification
protocol that is suitably homomorphic. The RSA-based HLA used by Ateniese
et al. P (see also [IH Appendix E]) can be viewed as an instance of our mech- 
anism applied to the Guillou-Quisquater cm identification protocol; similarly, 
the Shacham- Waters scheme m can be seen as being derived from an under- 
lying identification protocol in bilinear groups. By applying our transformation 
to a variant of Shoup’s identification scheme based on factoring m, we ob- 
tain the first publicly-verifiable HLA based on factoring (in the random oracle 
model). 

We also show a generic transformation from any HLA to a publicly-verifiable 
proof of storage with communication complexity independent of the file size. This 
transformation is in the standard model, and answers an open question from d. 
An analogous transformation with similar properties was shown (independently) 


Proofs of Storage from Homomorphic Identification Protocols 321 


by Dodis et al. jS] in the setting of simpler private verifiability; our technique is 
different from theirs and is of independent interest. 

Combining our results, we obtain a publicly-verifiable proof of storage based 
on the factoring assumption in the random oracle model. In our PoS, the
communication complexity and the size of the client’s state are independent$^1$ of the
file size, and the server’s storage is a constant multiple of the file size. In the PoS
we describe, the computation of both the client and the server is linear in the file 
size, but notice that public-key HLAs can be layered on top of erasure codes (as 
in fl4!4j l or used in conjunction with a probabilistic approach for multiple audits 
(as in |U) to obtain better performance while retaining public verifiability. 

2 Definitions 

We write $x \leftarrow X$ to represent an element x being sampled uniformly at random
from a set X. The output y of a randomized algorithm A running on input x is
denoted by $y \leftarrow A(x)$. We sometimes write $y := A(x; r)$ to denote the
(deterministic) result of running A on input x and random coins r. We use boldface
to denote vectors. Given a vector v we let $v_i$ denote its ith component.

Throughout, $k \in \mathbb{N}$ denotes the security parameter. A function $\nu : \mathbb{N} \to \mathbb{R}$ is
negligible if for every polynomial $p(\cdot)$ and large enough k, we have $\nu(k) < 1/p(k)$.

2.1 Homomorphic Linear Authenticators 

Homomorphic linear authenticators (HLAs) were introduced by Ateniese et al. |U 
as a building block for constructing communication-efficient proofs of storage; 
they were further studied in 1 1 Hoi . At a high level, HLAs are used as follows: 
viewing the file f as an n-dimensional vector, the client begins by tagging each
element of f and then sending both f and the vector of tags t to the server. To
verify that the server is storing the entire file, the client sends a random challenge
vector c and the server returns $\mu = \sum_i c_i \cdot f_i$ along with a tag $\tau$, computed using
f, t, and c, which is supposed to authenticate this value.

HLAs can be defined both in the private and public-key settings. We give a 
definition for public-key HLAs and refer the reader to jSJ for a formalization of 
private-key HLAs.

Definition 1 (Homomorphic linear authenticator). A public-key homo- 
morphic linear authenticator is a tuple of four ppt algorithms (Gen, Tag, Auth, 
Vrfy) such that: 

$(pk, sk) \leftarrow Gen(1^k)$ is a probabilistic algorithm used to set up the scheme. It
takes as input the security parameter and outputs a public and private key
pair (pk, sk). We assume pk defines a k-bit prime p and a positive integer B.

$(t, st) \leftarrow Tag_{sk}(f)$ is a probabilistic algorithm that is run by the client in order
to tag a file. It takes as input a secret key sk and a file $f \in [B]^n$, and outputs
a vector of tags t and state information st.

$\tau := Auth_{pk}(f, t, c)$ is a deterministic algorithm that is run by the server to
generate a tag. It takes as input a public key pk, a file $f \in [B]^n$, a tag
vector t, and a challenge vector $c \in \mathbb{Z}_p^n$; it outputs a tag $\tau$.

$b := Vrfy_{pk}(st, \mu, c, \tau)$ is a deterministic algorithm that is used to verify a tag.
It takes as input a public key pk, state information st, an element $\mu \in \mathbb{N}$,
a challenge vector $c \in \mathbb{Z}_p^n$, and a tag $\tau$. It outputs a bit, where ‘1’ indicates
acceptance and ‘0’ indicates rejection.

1 The communication complexity for a file of size n is $O(\log n + k)$, and as in [£| we
assume k log n.

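For correctness, we require that for all $k \in \mathbb{N}$, all (pk, sk) output by $Gen(1^k)$, all
$f \in [B]^n$, all (t, st) output by $Tag_{sk}(f)$, and all $c \in \mathbb{Z}_p^n$, it holds that

$$Vrfy_{pk}\Big(st, \sum_i c_i f_i, c, Auth_{pk}(f, t, c)\Big) = 1.$$
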
We remark that in certain schemes correctness (and security) may hold even
when Vrfy is given only $\sum_i c_i f_i \bmod p$ (assuming B < p). In such cases the
communication from the server to the client can be further reduced.

Informally, an HLA is secure if, for a given file f and challenge vector c, no
adversary can output a valid authenticator for an element $\mu' \neq \sum_i c_i f_i$.

Definition 2 (Unforgeability for public-key HLAs). Let $\Lambda$ = (Gen, Tag,
Auth, Vrfy) be a public-key HLA and A be an adversary, and consider the
following experiment:

1. The challenger computes $(pk, sk) \leftarrow Gen(1^k)$, where pk defines p and B.

2. Given pk and oracle access to $Tag_{sk}(\cdot)$, adversary A outputs a file $f \in [B]^n$.

3. The challenger tags the file by computing $(t, st) \leftarrow Tag_{sk}(f)$.

4. Given t and st, the adversary A outputs a challenge vector $c \in \mathbb{Z}_p^n$, an
element $\mu' \in \mathbb{Z}$, and a tag $\tau'$.

5. The adversary succeeds if $\mu' \neq \sum_i c_i f_i$ and $Vrfy_{pk}(st, \mu', c, \tau') = 1$.

$\Lambda$ is unforgeable if the success probability of every ppt adversary A in the above
experiment is negligible.

The distinctions between the case of public verifiability (as defined above) and 
private verifiability (as defined in jS]) are that, in the former setting (1) ver- 
ification does not require the original secret key sk but only the state st and 
the original public key; (2) unforgeability holds even against an adversary who 
knows the public information pk and st. Our definition is also stronger than the 
one given in j£j in that we initially give the adversary access to a tagging oracle. 

2.2 Homomorphic Identification Protocols 

An identification protocol allows a prover P in possession of a secret key sk to
prove its identity to a verifier V that possesses the corresponding public key pk.
We consider 3-move identification protocols where the prover generates the first
message $\alpha$ using the public key pk and randomness r; the verifier sends a random
challenge $\beta$; and the prover then computes a response $\gamma$ using (pk, sk), the
randomness r, and the verifier’s challenge $\beta$. Given the transcript of the protocol,
the verifier decides whether to accept or not.

Definition 3 (Identification protocol). An identification protocol is a
three-move protocol between a ppt prover P and a ppt verifier V. The protocol consists
of four polynomial-time algorithms (Setup, Comm, Resp, Vrfy) such that:

$(pk, sk) \leftarrow Setup(1^k)$ is a probabilistic algorithm that takes as input the security
parameter and outputs a public and private key pair (pk, sk).

$\alpha \leftarrow Comm(pk; r)$ is a probabilistic algorithm run by the prover P to generate
the first message. It takes as input the public key and random coins r, and
outputs an initial message $\alpha$. We stress that there is no need for sk.

$\gamma \leftarrow Resp(pk, sk, r, \beta)$ is a probabilistic algorithm that is run by the prover P
to generate the third message. It takes as input the public key pk, the secret
key sk, a random string r, and a challenge $\beta$ (from some associated challenge
space), and outputs a response $\gamma$.

$b := Vrfy(pk, \alpha, \beta, \gamma)$ is a deterministic algorithm run by the verifier V to decide
whether to accept the interaction. It takes as input the public key pk, an
initial message $\alpha$, a challenge $\beta$, and a response $\gamma$. It outputs a bit b, where
‘1’ indicates acceptance and ‘0’ indicates rejection.

For correctness, we require that for all $k \in \mathbb{N}$, all (pk, sk) output by $Setup(1^k)$,
all random coins r, and all $\beta$ in the appropriate challenge space, it holds that

$$Vrfy\big(pk, Comm(pk; r), \beta, Resp(pk, sk, r, \beta)\big) = 1.$$

An identification protocol is homomorphic if the verification of several transcripts
of the protocol can be “batched”:

Definition 4 (Homomorphic identification protocol). An identification
protocol $\Sigma$ = (Setup, Comm, Resp, Vrfy) is homomorphic if there exist efficient
functions $Combine_1$, $Combine_3$ such that:

Completeness: For all (pk, sk) output by $Setup(1^k)$ and all $c \in \mathbb{Z}_{2^k}^n$, if
transcripts $\{(\alpha_i, \beta_i, \gamma_i)\}_{1 \le i \le n}$ are such that $Vrfy(pk, \alpha_i, \beta_i, \gamma_i) = 1$ for all i, then:

$$Vrfy\Big(pk, Combine_1(c, \alpha), \sum_i c_i \beta_i, Combine_3(c, \gamma)\Big) = 1.$$

Unforgeability: Consider the following experiment involving an adversary A:

1. The challenger computes $(pk, sk) \leftarrow Setup(1^k)$ and gives pk to A.

2. The following is repeated a polynomial number of times:

- A outputs $\beta'$ in the challenge space. The challenger chooses
random r, computes $\gamma := Resp(pk, sk, r, \beta')$, and gives $(r, \gamma)$ to A.

3. The adversary outputs an n-vector of challenges $\beta$. Then for each i the
challenger chooses $r_i$ at random, sets $\alpha_i := Comm(pk; r_i)$ and
$\gamma_i := Resp(pk, sk, r_i, \beta_i)$, and gives $(r, \gamma)$ to A.

4. A outputs a triple $(c, \mu', \gamma')$, where $c \in \mathbb{Z}_{2^k}^n$. The adversary succeeds if
(1) $\mu' \neq \sum_i c_i \beta_i$ and (2) $Vrfy(pk, Combine_1(c, \alpha), \mu', \gamma') = 1$.
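
To see what the completeness condition of Definition 4 asks for, the following sketch checks it on a toy RSA-style three-move protocol. This instantiation is an assumption chosen for illustration and is not taken from the paper: the parameters are far too small to be secure, the names are hypothetical, and no unforgeability claim is made. Here the first message is simply the prover's coins, so no secret key is needed to compute it, and a single function plays the role of both $Combine_1$ and $Combine_3$.

```python
# Toy RSA-style parameters: illustrative assumptions only, far too small to be secure.
P_, Q_ = 1009, 1013
N = P_ * Q_
E = 65537
D = pow(E, -1, (P_ - 1) * (Q_ - 1))   # secret key (Python 3.8+)
Y = 4                                  # public element of Z_N^*


def comm(r: int) -> int:
    """First message: here simply the coins reduced mod N (no secret key needed)."""
    return r % N


def resp(r: int, beta: int) -> int:
    """Response: an E-th root of alpha * Y^beta, computed with the secret key."""
    return pow(comm(r) * pow(Y, beta, N) % N, D, N)


def vrfy(alpha: int, beta: int, gamma: int) -> bool:
    """Accept iff gamma^E == alpha * Y^beta (mod N)."""
    return pow(gamma, E, N) == alpha * pow(Y, beta, N) % N


def combine(c, values):
    """Serves as both Combine_1 and Combine_3: the product of values[i]^c[i] mod N."""
    out = 1
    for c_i, v_i in zip(c, values):
        out = out * pow(v_i, c_i, N) % N
    return out


coins = [123456, 789, 424242]
betas = [5, 0, 17]
alphas = [comm(r) for r in coins]
gammas = [resp(r, b) for r, b in zip(coins, betas)]
c = [3, 1, 4]
mu = sum(c_i * b_i for c_i, b_i in zip(c, betas))   # sum_i c_i * beta_i
batched_ok = vrfy(combine(c, alphas), mu, combine(c, gammas))
```

Each individual transcript verifies, and so does the batched one, because $(\prod_i \gamma_i^{c_i})^E = \prod_i \alpha_i^{c_i} \cdot Y^{\sum_i c_i \beta_i}$ modulo N. Replacing `mu` by `mu + 1` makes the batched check fail.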
2.3 Proofs of Storage


Definition 5 (Proof of storage). A (publicly-verifiable) proof of storage is a
tuple of four ppt algorithms (Gen, Encode, Prove, Vrfy) such that:

$(pk, sk) \leftarrow Gen(1^k)$ is a probabilistic algorithm that is run by the client to set
up the scheme. It takes as input a security parameter, and outputs a public
and private key pair (pk, sk). We assume pk defines a k-bit prime p and a
positive integer B.

$(f', st) \leftarrow Encode_{sk}(f)$ is a probabilistic algorithm that is run by the client
in order to encode the file. It takes as input the secret key sk, and a file
$f \in [B]^n$. It outputs an encoded file $f'$ and state information st.

7r := Pro ve(pk, f ,c) is a deterministic algorithm that takes as input the public 
key pk, an encoded file f, and a challenge cgT,™. It outputs a proof n. 
b := Vrfy (pk, st, c, 7 r): is a deterministic algorithm that takes as input the public 
key pk, the state st, a challenge c G Z™, and a proof n. It outputs a bit, 
where ‘V indicates acceptance and ‘0’ indicates rejection. 

We require that for all k G N, all ( pk,sk ) output by Gen(l fc ), all f G [B} n , all 
( f ,st ) output by Encode^/), and all c G Z”, it holds that 

Vrfy (pk , st, c, Pro ve(pfc, c)^ = 1. 

Note that the above defines a publicly-verifiable PoS since the original secret key 
sk is not needed in order to perform verification. 

Security of a PoS, roughly speaking, guarantees that if the verifier accepts 
then the prover indeed has (sufficient information to recover) the entire original 
file f. As noted in jilt 111415] , soundness can be formalized using the notion of a 
knowledge extractor E0. As in |5|, we phrase our definition using the paradigm 
of “witness-extended emulation” 021 - 

Definition 6 (Security for a publicly-verifiable PoS). Let $\Pi$ = (Gen,
Encode, Prove, Vrfy) be a publicly-verifiable PoS. $\Pi$ is secure if there is an
expected polynomial-time knowledge extractor $\mathcal{K}$ such that, for any ppt
adversary A we have:

1. The distributions

$$\left\{ (pk, sk) \leftarrow Gen(1^k);\ (f, st_A) \leftarrow A^{Encode_{sk}(\cdot)}(pk);\ (f', st) \leftarrow Encode_{sk}(f);\ c \leftarrow \text{challenge space} \ :\ (c, A(st_A, f', st, c)) \right\}$$

and

$$\left\{ (pk, sk) \leftarrow Gen(1^k);\ (f, st_A) \leftarrow A^{Encode_{sk}(\cdot)}(pk);\ (f', st) \leftarrow Encode_{sk}(f) \ :\ \mathcal{K}_1^{A(st_A, f', st, \cdot)}(pk, st) \right\}$$

are identical. (Above, $\mathcal{K}_1$ denotes the first output of $\mathcal{K}$.)

2. The following is negligible:

$$\Pr\left[ (pk, sk) \leftarrow Gen(1^k);\ (f, st_A) \leftarrow A^{Encode_{sk}(\cdot)}(pk);\ (f', st) \leftarrow Encode_{sk}(f);\ ((c, \pi), f^*) \leftarrow \mathcal{K}^{A(st_A, f', st, \cdot)}(pk, st) \ :\ Vrfy(pk, st, c, \pi) = 1 \wedge f^* \neq f \right]$$

3 From Homomorphic Identification Protocols to HLAs

We now show how to transform any homomorphic identification protocol £ = 
(Setup, Comm, Resp, Vrfy) into a public-key HLA. The basic idea is to use the file 
blocks /i , . . . , f n as the “challenges” in n parallel invocations of the identification 
protocol. Thus, a very basic PoS would be as follows: 

- The client computes ( pk,sk ) <— Gen(l fc ). 

- For each block /, of the file, the client computes a i: 7 , such that (ai, fi,~fi) 
is an accepting transcript in the underlying identification scheme. 

- The client sends to the server the file / = /1 1 ■ ■ ■ | f n and the tags 71, . . . , 7 n ; 
the client stores ai , . . . , a n as its own local state. 

To verify that the server stores the ith block of the file, the client requests the 
server to send (/*, 7 i); the client can authenticate this response by checking that 
(ai, is an accepting transcript. 

There are several drawbacks to the above approach. First, the client’s state is linear in the file size.² This is easy to remedy by having the client generate each $\alpha_i$ using a pseudorandom function (if private verifiability suffices) or a random oracle (if public verifiability is desired, as here). A more serious problem is that a server can easily “cheat” without being caught “too often” by throwing away blocks of the file. If the server deletes, say, 1 block from the file then it is only caught with probability $1/n$. This can be addressed, to some extent, by having the client request many blocks, but then the communication complexity increases.
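
To make the trade-off concrete, the following is a minimal sketch of the detection probability under an assumption not stated in the text: the client audits `sampled` distinct block indices chosen uniformly at random. The function name and parameters are illustrative only.

```python
from math import comb


def detection_probability(n: int, deleted: int, sampled: int) -> float:
    """Chance that at least one of `sampled` distinct, uniformly chosen block
    indices hits one of the `deleted` blocks of an n-block file."""
    if sampled > n - deleted:
        return 1.0  # more samples than surviving blocks: a deleted block must be hit
    miss = comb(n - deleted, sampled) / comb(n, sampled)
    return 1.0 - miss
```

With one deleted block and one audited block this gives $1/n$, matching the text; pushing the probability close to 1 requires auditing a constant fraction of the file, which is the communication cost the batching idea avoids.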

Instead, we rely on the homomorphic property of the identification scheme to “batch” the authentication of multiple blocks. Specifically, the client will send a random integer vector $c$ and the server will respond with $\mu := \sum_i c_i f_i$ and $\gamma' := \mathsf{Combine}_3(c, \gamma)$. This response can be verified by checking whether

\[ \mathsf{Vrfy}(pk, \mathsf{Combine}_1(c, \alpha), \mu, \gamma') = 1. \]

(See Figured) Although the client-to-server communication is large, the server- 
to-client communication is essentially independent of the file size (cf. footnoted- 
We reduce the client-to-server communication when we construct a PoS in the 
next section. 

Theorem 1. If $\Sigma$ is an unforgeable homomorphic identification protocol, then $\Lambda$ as in Figure 1 is an unforgeable public-key HLA if $H$ is modeled as a random oracle.

Proof. Correctness is easy to verify, and so we consider security. Let $\mathcal{A}$ be a ppt adversary attacking $\Lambda$. We construct an adversary $\mathcal{A}'$ attacking $\Sigma$ as follows:

1. $\mathcal{A}'$ is given a public key $pk$, generates $B$ and $p$ in the obvious way, and runs $\mathcal{A}(pk, p, B)$.

² In some cases linear state may be acceptable, as long as the state is a constant fraction shorter than the file itself. When using certain homomorphic identification schemes, including the one discussed in Section 5, this indeed can be achieved.
Let $\Sigma$ = (Setup, Comm, Resp, Vrfy) be a homomorphic identification protocol and let $H$ be a function. Construct a public-key HLA $\Lambda$ = (Gen, Tag, Auth, Vrfy) as follows:

- $\mathsf{Gen}(1^k)$: Compute $(pk, sk) \leftarrow \Sigma.\mathsf{Setup}(1^k)$. Let $B$ be such that $[B]$ is in the challenge space of $\Sigma$, and choose a $k$-bit prime $p$. Output the public key $(pk, p, B)$ and secret key $sk$.

- $\mathsf{Tag}_{sk}(f)$, where $f = f_1 | \cdots | f_n$, and $f_i \in [B]$ for all $i$:

  1. Choose $st \leftarrow \{0,1\}^k$.

  2. For $1 \le i \le n$:

     a. Set $r_i := H(st, i)$ and $\alpha_i := \Sigma.\mathsf{Comm}(pk; r_i)$.

     b. Compute $\gamma_i := \Sigma.\mathsf{Resp}(pk, sk, r_i, f_i)$.

  3. Output $t := (\gamma_1, \ldots, \gamma_n)$ and $st$.

- $\mathsf{Auth}_{pk}(f, t, c)$: Compute and output $\tau \leftarrow \Sigma.\mathsf{Combine}_3(c, t)$.

- $\mathsf{Vrfy}_{pk}(st, \mu, c, \tau)$:

  1. For $1 \le i \le n$, set $r_i := H(st, i)$ and $\alpha_i := \Sigma.\mathsf{Comm}(pk; r_i)$.

  2. Output $\Sigma.\mathsf{Vrfy}(pk, \mathsf{Combine}_1(c, \alpha), \mu, \tau)$.


Fig. 1. Transforming a homomorphic identification protocol into an HLA

2. When $\mathcal{A}$ requests $\mathsf{Tag}_{sk}(f)$ for $f = f_1 | \cdots | f_n$, then (for $i = 1$ to $n$) $\mathcal{A}'$ queries $f_i$ to its own oracle and receives in return $(r_i, \gamma_i)$. Then $\mathcal{A}'$ chooses random $st \in \{0,1\}^k$, sets answers to the random oracle appropriately, and gives $(\gamma_1, \ldots, \gamma_n)$ and $st$ to $\mathcal{A}$.

3. Eventually, $\mathcal{A}$ outputs a file $f$. Following this, $\mathcal{A}'$ outputs the vector of $n$ challenges $f = f_1 | \cdots | f_n$, and receives in return $(r, \gamma)$. Then $\mathcal{A}'$ chooses random $st \in \{0,1\}^k$, sets³ answers to the random oracle appropriately, and gives $(\gamma, st)$ to $\mathcal{A}$.

4. When $\mathcal{A}$ finally outputs $c, \mu', \tau'$, then $\mathcal{A}'$ outputs these same values.

It is easy to see that $\mathcal{A}$ succeeds in attacking $\Lambda$ exactly when $\mathcal{A}'$ succeeds in attacking $\Sigma$.

4 From HLAs to Efficient Proofs of Storage 

In this section we show how to use any HLA to construct a PoS having com- 
munication complexity independent of the file size. Our transformation is in the 
standard model. 

It is immediate how an HLA can be used to construct a PoS with communication complexity linear in the file size: When storing a file $f$, the client computes tags on all the file blocks and gives to the server the vector of tags $t$ (along with $f$ itself). To verify, the client chooses a random $c \in \mathbb{Z}_p^n$ and sends it to the server; the server responds with $\sum_i c_i f_i$ and $\mathsf{Auth}_{pk}(f, t, c)$ (which is authenticated by

³ We assume for simplicity that no $st \in \{0,1\}^k$ is chosen twice throughout the experiment, since this occurs with only negligible probability.
the client in the obvious way). If authentication tags output by Auth have length $O(k)$, then the server-to-client communication for an $n$-block file is bounded by

\[ O(k) + \log\Bigl(\sum_i c_i f_i\Bigr) \le O(k) + \log(n \cdot p \cdot B) = O(k) + \log n. \]

For typical values of $k$, $n$, this means that the server-to-client communication is (essentially) independent of the file size.
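
As a rough numerical illustration of this bound, the sketch below counts the bits needed for the server's response. It assumes, beyond what the text states, that both the challenge entries and the file blocks are below $2^k$ (that is, $B \le 2^k$); the function name is hypothetical.

```python
def response_bits(k: int, n: int, tag_bits: int) -> int:
    """Upper bound on the response size in bits: the tag plus the largest
    possible value of sum_i c_i * f_i, assuming c_i < 2**k and f_i < 2**k."""
    mu_max = n * (2**k - 1) * (2**k - 1)
    return tag_bits + mu_max.bit_length()
```

For $k = 128$ and a file of $2^{30}$ blocks, the sum fits in 286 bits, so multiplying the file size by a thousand adds only about ten bits to the response.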

To reduce the client-to-server communication, we use a pseudorandom function $F$: the client sends a key $K \leftarrow \{0,1\}^k$, and the server then derives the challenge vector $c$ by setting $c_i := F_K(i)$ for all $i$. (See Figure 2.) This approach is, perhaps, quite “natural”,⁴ but it turns out to be highly non-trivial to prove
that it is sound. (This difficulty was mentioned in mo The issue is that since 
the key K is public, we cannot reduce to the security of the pseudorandom 
function in the usual way. Instead we must use a more careful analysis. 


Let $\Lambda$ = (Gen, Tag, Auth, Vrfy) be a public-key HLA, and let $F$ be a pseudorandom function. Construct a publicly-verifiable PoS $\Pi$ = (Gen, Encode, Prove, Vrfy) as follows:

- $\mathsf{Gen}(1^k)$: Compute and output $(pk, sk) \leftarrow \Lambda.\mathsf{Gen}(1^k)$. Let $p$ be the prime implicit in $pk$.

- $\mathsf{Encode}_{sk}(f)$: Compute $(t, st) \leftarrow \Lambda.\mathsf{Tag}_{sk}(f)$, and output $f' = (f, t)$ and $st$.

- $\mathsf{Prove}(pk, f', K)$, where $K \in \{0,1\}^k$:

  1. Parse $f'$ as $(f, t)$.

  2. For $1 \le i \le n$ let $c_i := F_K(i)$, where $c_i$ is viewed as an element of $\mathbb{Z}_p$.

  3. Compute $\tau \leftarrow \Lambda.\mathsf{Auth}_{pk}(f, t, c)$ and $\mu := \sum_i c_i f_i$.

  4. Output $\pi := (\mu, \tau)$.

- $\mathsf{Vrfy}(pk, st, K, \pi)$:

  1. Parse $\pi$ as $(\mu, \tau)$.

  2. For $1 \le i \le n$, let $c_i := F_K(i)$.

  3. Output $b := \Lambda.\mathsf{Vrfy}_{pk}(st, \mu, c, \tau)$.


Fig. 2. Transforming an HLA into a PoS
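
The challenge-derivation step of Figure 2 can be sketched as follows. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for the pseudorandom function $F$ (the text does not fix a concrete PRF), the output is reduced modulo $p$ with a small bias that a real instantiation would need to address, and the helper names are hypothetical. The tag part $\tau$ of the proof is omitted.

```python
import hashlib
import hmac


def prf(key: bytes, i: int, p: int) -> int:
    """Stand-in for F_K(i): HMAC-SHA256 of the index, reduced modulo p."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % p


def challenge_vector(key: bytes, n: int, p: int) -> list:
    """c = (F_K(1), ..., F_K(n)), recomputed identically by prover and verifier."""
    return [prf(key, i, p) for i in range(1, n + 1)]


def prove_mu(blocks: list, key: bytes, p: int) -> int:
    """mu = sum_i c_i * f_i, computed over the integers as in step 3 of Prove."""
    c = challenge_vector(key, len(blocks), p)
    return sum(ci * fi for ci, fi in zip(c, blocks))
```

Only the short key travels from client to server; both sides expand it locally, which is why the client-to-server communication no longer grows with $n$.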


Theorem 2. Let $\Lambda$ be an unforgeable public-key HLA, and let $F$ be a pseudorandom function secure against non-uniform polynomial-time adversaries. Then $\Pi$ as in Figure 2 is a secure publicly-verifiable PoS.

Proof. Correctness of the construction is easily verified, and so we turn to proving security. We describe a knowledge extractor $\mathcal{K}$ that runs in expected polynomial time and satisfies Definition 6. Recall that $\mathcal{K}$ is given $pk, st$ as input and has

4 A similar approach, based on pseudorandom generators, was proposed in 0 in the 
context of verifiable shuffles. 
oracle access to $\mathcal{A}(st_{\mathcal{A}}, f', st, \cdot)$, which we abbreviate as $\mathcal{A}(\cdot)$. Define $c(K) = (F_K(1), \ldots, F_K(n))$. The high-level structure of $\mathcal{K}$ is as follows:

1. $\mathcal{K}$ chooses random $K \leftarrow \{0,1\}^k$ and runs $\mathcal{A}(K)$ to obtain a proof $\pi$. If $\mathsf{Vrfy}(pk, st, K, \pi) = 0$ then $\mathcal{K}$ outputs $((K, \pi), \bot)$ and stops. Otherwise, its first output will still be $(K, \pi)$ but it attempts to recover the original file as described next.

2. $\mathcal{K}$ repeatedly rewinds $\mathcal{A}$ and sends it different challenges until $\mathcal{A}$ responds correctly to a total of $n$ challenges $K_1, \ldots, K_n$ such that $c(K_1), \ldots, c(K_n)$ are linearly independent (over $\mathbb{Q}$). Given $n$ successful responses to these $n$ challenges, $\mathcal{K}$ reconstructs a candidate file $f$, and outputs it.

The above neglects some technical details that we now formalize. If $\mathcal{A}(K)$ outputs a proof $\pi = (\mu, \tau)$ for which $\mathsf{Vrfy}_{pk}(st, \mu, c(K), \tau) = 1$, then we say that $K$ is a good challenge. $\mathcal{K}$ implements step 2, above, as follows:

1. Initialize sets $\mathsf{Good}_K := \mathsf{Good}_c := \emptyset$. Keep track of the total number of calls to $\mathcal{A}$, and halt execution with output fail if $2^k$ calls are made.

2. Estimate the probability $p^*$ with which a random key $K$ is good by running $\mathcal{A}$ with a random challenge until some fixed polynomial number $q = q(k)$ of successful verifications occur. By appropriate choice of $q$, it is possible to ensure that the estimate $\tilde{p}$ is within a factor of 2 of the true probability $p^*$ with all but negligible probability $2^{-k}$.

3. For $j = 1$ to $n$ do:

   - Repeatedly sample $K_j$ uniformly, querying $\mathcal{A}$ on each one, until a good $K_j$ with $c(K_j) \notin \mathrm{span}(\mathsf{Good}_c)$ is found. If found, then add $K_j$ to $\mathsf{Good}_K$ and add $c_j = c(K_j)$ to $\mathsf{Good}_c$, and go to the next value of $j$. If no such $K_j$ is found in at most $k^2/\tilde{p}$ tries, then output fail and halt.

4. Let $\mathsf{Good}_K = \{K_1, \ldots, K_n\}$ and $\mathsf{Good}_c = \{c_1, \ldots, c_n\}$, where $c_j = c(K_j)$, and let $\pi_j = (\mu_j, \tau_j)$ be the output of $\mathcal{A}(K_j)$. Set up the system of linear equations $\{\sum_i c_{j,i} \cdot f_i = \mu_j\}_{1 \le j \le n}$ in the unknowns $f = (f_1, \ldots, f_n)$. Solve for $f$ (over the integers) and output it.
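
Step 4 is ordinary linear algebra. A minimal sketch, assuming the $n$ challenge vectors are already known to be linearly independent over $\mathbb{Q}$, is Gauss-Jordan elimination with exact rational arithmetic; the function name is illustrative and is not part of the construction.

```python
from fractions import Fraction


def solve_over_q(C, mu):
    """Solve C f = mu exactly over Q, where row j of C is the challenge vector
    c_j and mu[j] is the value mu_j returned for it. C must be n x n."""
    n = len(C)
    M = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(C, mu)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("challenge vectors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        inv = 1 / M[col][col]
        M[col] = [x * inv for x in M[col]]          # normalize the pivot row
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]
```

When every $\mu_j$ really equals $\sum_i c_{j,i} f_i$ for an integer file $f$, the unique rational solution is that file, so the entries returned are integers.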

We refer to the above as the extraction subroutine. 

To complete the proof, we need to show three things. First, that $\mathcal{K}$ runs in
expected polynomial time for any A. Second, that if A successfully convinces a 
verifier in the PoS protocol with sufficiently high probability, then the extraction 
procedure will successfully complete (specifically, step 3 will be successful) with 
overwhelming probability. Third, that with overwhelming probability the file f 
output by the extraction procedure is indeed equal to the true file f. The first 
and third of these items are essentially standard. The second step would be 
relatively straightforward if the challenge in the PoS protocol were a random 
vector c; what makes it more complicated is that the challenge is a PRF key K 
that is expanded to a vector c = c(K). 
Fixing $st_{\mathcal{A}}$, $f'$, and $st$, we let $p^*$ denote the probability that a random challenge $K$ is good; i.e., this is the probability with which $\mathcal{A}(st_{\mathcal{A}}, f', st, \cdot)$ responds correctly to the verifier’s challenge (we assume $st_{\mathcal{A}}$ includes $\mathcal{A}$’s coins).

Claim. $\mathcal{K}$ runs in expected polynomial time.


Proof. If $p^* = 0$ then it is clear that $\mathcal{K}$ runs in expected polynomial time.
So assume p* > 0. We must then analyze the expected running time of the 
extraction procedure, following [Mil 2j . Steps 1 and 4 take strict polynomial time. 
The expected running time of step 2 is exactly (some polynomial times) $q(k)/p^*$. As for step 3, there are two cases, depending on the estimate $\tilde{p}$ of $p^*$ obtained in step 2: If $\tilde{p} < p^*/2$, then the only thing we can claim is that the running time is bounded by (some polynomial times) $2^k$, due to the counter being maintained in step 1. But the probability that $\tilde{p} < p^*/2$ is at most $2^{-k}$. On the other hand, if $\tilde{p} \ge p^*/2$ then the expected running time of step 3 is at most (some polynomial times) $n \cdot k^2/\tilde{p} \le 2nk^2/p^*$.

$\mathcal{K}$ only runs the extraction procedure with probability $p^*$. Thus, the overall expected running time of $\mathcal{K}$ is upper-bounded by

\[ p^* \cdot \Bigl( \mathrm{poly}(k) + \mathrm{poly}(k) \cdot q(k)/p^* + \mathrm{poly}(k) \cdot 2^k \cdot 2^{-k} + \mathrm{poly}(k) \cdot 2nk^2/p^* \Bigr), \]

which is polynomial.

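Claim. There exists a negligible function $\varepsilon(\cdot)$ such that if $p^* > \varepsilon(k)$ then the probability (conditioned on the extraction procedure being run) that the extraction procedure outputs fail is negligible.


Observe this implies that

\[
\Pr\left[
\begin{array}{l}
(pk, sk) \leftarrow \mathsf{Gen}(1^k); \\
(f, st_{\mathcal{A}}) \leftarrow \mathcal{A}^{\mathsf{Encode}_{sk}(\cdot)}(pk); \\
(f', st) \leftarrow \mathsf{Encode}_{sk}(f); \\
((c, \pi), f^*) \leftarrow \mathcal{K}^{\mathcal{A}(st_{\mathcal{A}}, f', st, \cdot)}(pk, st)
\end{array}
:\ \mathsf{Vrfy}(pk, st, c, \pi) = 1 \ \wedge\ f^* = \mathsf{fail}
\right]
\]

is negligible.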

Proof. We view the $c_j = c(K_j)$ as vectors over $\mathbb{Z}_p$, and use the fact that integer vectors $c_1, \ldots, c_\ell$, with entries in the range $\{0, \ldots, p-1\}$, are linearly dependent over $\mathbb{Q}$ only if they are linearly dependent over $\mathbb{Z}_p$; thus, an upper bound on the probability of the latter implies an upper bound on the probability of the former. Define

\[ \varepsilon'(k) = \max_L \bigl\{ \Pr[K \leftarrow \{0,1\}^k : c(K) \in L] \bigr\}, \]

where the maximum is taken over all $(n-1)$-dimensional subspaces $L \subset \mathbb{Z}_p^n$. It is not hard to see that if $F$ is a non-uniformly secure PRF then $\varepsilon'(k) - 1/p$ is negligible. Since $1/p$ is negligible, we see that $\varepsilon'$ is negligible too. Take $\varepsilon = 2\varepsilon'$. We show that if $p^* > \varepsilon$ then, conditioned on the extraction procedure being run, the probability that it outputs fail is negligible.
First, observe that the probability that $\mathcal{K}$ times out by virtue of running for $2^k$ steps is negligible (this follows from the fact that the expected running time of $\mathcal{K}$ is polynomial). Next, fix any $j$ and consider step 3. The number of challenges that are good is exactly $p^* \cdot 2^k$, and the number of challenges $K_j$ for which $c(K_j)$ lies in $\mathrm{span}(\mathsf{Good}_c)$ (which has dimension at most $n-1$) is at most $\varepsilon' \cdot 2^k < p^* \cdot 2^k/2$. Thus, the probability that a random $K_j$ is both good and does not lie in $\mathrm{span}(\mathsf{Good}_c)$ is at least $p^*/2$. If the estimate $\tilde{p}$ computed in step 2 is within a factor of 2 of $p^*$, which occurs with all but negligible probability, then $\mathcal{K}$ finds such a $K_j$ within $k^2/\tilde{p}$ steps with all but negligible probability; a union bound over all values of $j \in [n]$ then shows that it fails in some iteration with only negligible probability. This completes the proof.

Finally, we show that the probability that the extraction procedure outputs an incorrect file is negligible. In conjunction with the previous claims, this completes the proof that $\mathcal{K}$ satisfies Definition 6.

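Claim. For any PPT adversary $\mathcal{A}$, the following is negligible:

\[
\Pr\left[
\begin{array}{l}
(pk, sk) \leftarrow \mathsf{Gen}(1^k); \\
(f, st_{\mathcal{A}}) \leftarrow \mathcal{A}^{\mathsf{Encode}_{sk}(\cdot)}(pk); \\
(f', st) \leftarrow \mathsf{Encode}_{sk}(f); \\
((c, \pi), f^*) \leftarrow \mathcal{K}^{\mathcal{A}(st_{\mathcal{A}}, f', st, \cdot)}(pk, st)
\end{array}
:\ \mathsf{Vrfy}(pk, st, c, \pi) = 1 \ \wedge\ f^* \notin \{f, \mathsf{fail}\}
\right]
\]

Proof. The event in question can only occur if, at the end of the extraction procedure, there exists $c \in \mathsf{Good}_c$, with $c = c(K)$, for which $\mathcal{A}(K)$ outputs $(\mu, \tau)$ such that $\mathsf{Vrfy}(pk, st, K, (\mu, \tau)) = 1$ yet $\mu \neq \sum_i c_i f_i$. But this exactly means that $\mathcal{A}$ has violated the assumed unforgeability of $\Lambda$. Since $\mathcal{K}$ runs in expected polynomial time, it follows by a standard argument that this occurs with only negligible probability.

This concludes the proof of Theorem 2.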

5 A Concrete Instantiation Based on Factoring 

In this section we describe a homomorphic variant of the identification protocol 
of Shoup ESI. whose security is based on the hardness of factoring. Together with 
the transformations described in the previous sections, this yields a factoring- 
based PoS in the random oracle model. 

Protocol $\Sigma_{\mathrm{Shoup}}$, described in Figure 3, relies on a Blum modulus generator $\mathsf{Gen}_{\mathrm{Blum}}$ that takes as input a security parameter $1^k$ and outputs a tuple $(N, p, q)$ such that $N = p \cdot q$ where $p$ and $q$ are $k$-bit primes with $p = q = 3 \bmod 4$. We denote by $QR_N$ the set of quadratic residues modulo $N$, and by $J_N^{+1}$ the elements of $\mathbb{Z}_N^*$ with Jacobi symbol $+1$. We use the following standard facts regarding Blum integers: (1) given $x \in \mathbb{Z}_N^*$ it can be efficiently decided whether $x \in J_N^{+1}$; (2) if $x \in J_N^{+1}$, then exactly one of $x$ or $-x$ is in $QR_N$; (3) every $x \in QR_N$ has four square roots, exactly one of which is itself in $QR_N$.
Define homomorphic identification protocol $\Sigma_{\mathrm{Shoup}}$ as follows:

- $\mathsf{Setup}(1^k)$: Generate $(N, p, q) \leftarrow \mathsf{Gen}_{\mathrm{Blum}}(1^k)$. Choose $y \leftarrow QR_N$, and output $pk := (N, y)$ and $sk := (p, q)$.

- $\mathsf{Comm}(pk; r)$: View $r$ as an element of $J_N^{+1}$, and output $\alpha := r$.

- $\mathsf{Resp}(pk, sk, r, \beta)$: Let $\beta \in \mathbb{Z}_{2^k}$ (which defines the challenge space). Output $\gamma$, a random $2^{3k}$th root of $\pm r \cdot y^{\beta} \bmod N$ (where the sign is chosen to ensure that a square root exists).

- $\mathsf{Vrfy}(pk, \alpha, \beta, \gamma)$: Output 1 iff $\gamma^{2^{3k}} = \pm \alpha \cdot y^{\beta} \bmod N$ and $\beta < 2^{3k}$.

$\mathsf{Combine}_1$ and $\mathsf{Combine}_3$ are defined as follows:

- Let $c \in \mathbb{Z}_{2^k}^n$ and $\alpha \in (\mathbb{Z}_N^*)^n$. Then $\mathsf{Combine}_1(c, \alpha) = \prod_{i=1}^n \alpha_i^{c_i} \bmod N$.

- Let $c \in \mathbb{Z}_{2^k}^n$ and $\gamma \in (\mathbb{Z}_N^*)^n$. Then $\mathsf{Combine}_3(c, \gamma) = \prod_{i=1}^n \gamma_i^{c_i} \bmod N$.

Fig. 3. A homomorphic identification protocol based on factoring

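Correctness of $\Sigma_{\mathrm{Shoup}}$ as a stand-alone identification protocol is immediate. Let us verify that it is homomorphic. Fix public key $(N, y)$, challenge vector $c \in \mathbb{Z}_{2^k}^n$, and $\{(\alpha_i, \beta_i, \gamma_i)\}_{1 \le i \le n}$ such that $\gamma_i^{2^{3k}} = \pm \alpha_i \cdot y^{\beta_i} \bmod N$ for all $i$. Then

\[
\begin{aligned}
\mathsf{Combine}_3(c, \gamma)^{2^{3k}} &= \prod_i \bigl(\gamma_i^{2^{3k}}\bigr)^{c_i} \bmod N \\
&= \pm \prod_i \alpha_i^{c_i} \cdot y^{\beta_i c_i} \bmod N \\
&= \pm \mathsf{Combine}_1(c, \alpha) \cdot y^{\sum_i c_i \beta_i} \bmod N,
\end{aligned}
\]

and furthermore $\sum_i c_i \beta_i < n \cdot 2^k \cdot 2^k < 2^{3k}$.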
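
The homomorphic property can be checked numerically on a toy instance. The sketch below is not the authors' implementation: the primes are tiny illustrative values (1019 and 1031, both congruent to 3 mod 4), the parameter `K = 4` is far too small for security, and `resp` returns the one root lying in $QR_N$ rather than a uniformly random root. All function names are hypothetical.

```python
import math
import random

P, Q = 1019, 1031                 # toy Blum primes, both congruent to 3 mod 4
N = P * Q
K = 4                             # toy security parameter; challenges lie in [0, 2**K)
E = 3 * K                         # responses are 2**E-th roots
ROOT_EXP = ((P - 1) * (Q - 1) + 4) // 8   # x -> x**ROOT_EXP is the square root inside QR_N


def is_qr(x):
    """Quadratic-residue test that uses the factorization (the secret key)."""
    return pow(x, (P - 1) // 2, P) == 1 and pow(x, (Q - 1) // 2, Q) == 1


def random_unit(rng):
    while True:
        x = rng.randrange(2, N)
        if math.gcd(x, N) == 1:
            return x


def resp(r, y, beta):
    """Return a 2**E-th root of +r*y**beta or -r*y**beta, whichever is a square."""
    x = r * pow(y, beta, N) % N
    if not is_qr(x):
        x = N - x
    for _ in range(E):
        x = pow(x, ROOT_EXP, N)
    return x


def vrfy(y, alpha, beta, gamma):
    lhs = pow(gamma, 2 ** E, N)
    rhs = alpha * pow(y, beta, N) % N
    return beta < 2 ** E and lhs in (rhs, N - rhs)


def combine(c, xs):
    """Shared formula of Combine_1 and Combine_3: prod_i xs[i]**c[i] mod N."""
    out = 1
    for ci, xi in zip(c, xs):
        out = out * pow(xi, ci, N) % N
    return out


def demo(n=3, seed=0):
    rng = random.Random(seed)
    y = pow(random_unit(rng), 2, N)                       # y in QR_N
    alphas = [rng.choice((1, N - 1)) * pow(random_unit(rng), 2, N) % N
              for _ in range(n)]                          # Jacobi symbol +1
    betas = [rng.randrange(2 ** K) for _ in range(n)]
    gammas = [resp(a, y, b) for a, b in zip(alphas, betas)]
    c = [rng.randrange(2 ** K) for _ in range(n)]
    mu = sum(ci * bi for ci, bi in zip(c, betas))
    singles = all(vrfy(y, a, b, g) for a, b, g in zip(alphas, betas, gammas))
    batched = vrfy(y, combine(c, alphas), mu, combine(c, gammas))
    return singles and batched
```

Calling `demo()` checks that each individual transcript verifies and that the combined transcript $(\mathsf{Combine}_1(c,\alpha), \sum_i c_i\beta_i, \mathsf{Combine}_3(c,\gamma))$ verifies as well.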

Theorem 3. $\Sigma_{\mathrm{Shoup}}$ is an unforgeable homomorphic identification protocol if the factoring assumption holds with respect to $\mathsf{Gen}_{\mathrm{Blum}}$.

Proof. The high-level ideas are similar to those in ca. though the proof here 
is a bit simpler. Given a ppt adversary $\mathcal{A}$ attacking $\Sigma_{\mathrm{Shoup}}$, we construct a ppt algorithm $\mathcal{B}$ computing square roots modulo $N$ output by $\mathsf{Gen}_{\mathrm{Blum}}$. This implies factorization of $N$ in the standard way. Algorithm $\mathcal{B}$ works as follows:

- $\mathcal{B}$ is given a Blum modulus $N$ and a random $y \in QR_N$. It runs $\mathcal{A}$ on the public key $pk = (N, y)$.

- When $\mathcal{A}$ outputs $\beta \in \mathbb{Z}_{2^k}$, then $\mathcal{B}$ chooses random $\gamma \in \mathbb{Z}_N^*$ and $b \in \{0, 1\}$, and sets $r := \alpha := (-1)^b \cdot \gamma^{2^{3k}} / y^{\beta} \bmod N$. It then gives $(r, \gamma)$ to $\mathcal{A}$.

- When $\mathcal{A}$ outputs an $n$-vector of challenges $\beta$, then for each $i$ algorithm $\mathcal{B}$ computes $(r_i, \gamma_i)$ as in the previous step. It gives $(r, \gamma)$ to $\mathcal{A}$.

- If $\mathcal{A}$ outputs $(c, \mu', \gamma')$ with $\mathsf{Vrfy}(pk, \mathsf{Combine}_1(c, \alpha), \mu', \gamma') = 1$ but $\mu' \neq \sum_i c_i \beta_i$, then $\mathcal{B}$ computes a square root of $y$ as described below.

Note that the simulation provided for $\mathcal{A}$ by $\mathcal{B}$ is perfect, and so $\mathcal{A}$ succeeds in the above with the same probability with which it succeeds in attacking the real-world protocol $\Sigma_{\mathrm{Shoup}}$.

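To complete the proof, we describe the final step in more detail. Define

\[ \alpha^* = \mathsf{Combine}_1(c, \alpha), \qquad \gamma^* = \mathsf{Combine}_3(c, \gamma), \qquad \mu = \sum_i c_i \beta_i. \]

If $\mathsf{Vrfy}(pk, \alpha^*, \mu', \gamma') = 1$ but $\mu' \neq \mu$, then $(\gamma')^{2^{3k}} = \pm \alpha^* \cdot y^{\mu'} \bmod N$; furthermore, $\mathcal{B}$ also knows that $(\gamma^*)^{2^{3k}} = \pm \alpha^* \cdot y^{\mu} \bmod N$. Assume without loss of generality that $\mu > \mu'$. Since $y \in QR_N$ this implies

\[ (\gamma^*/\gamma')^{2^{3k}} = y^{\mu - \mu'} \bmod N \qquad (1) \]

with $\mu, \mu' < 2^{3k}$ (and so $\mu - \mu' < 2^{3k}$). Write $\mu - \mu' = f \cdot 2^t$ for $t < 3k$ and $f$ odd. Since squaring is a permutation of $QR_N$, Equation (1) implies

\[ (\gamma^*/\gamma')^{2^{3k-t}} = y^{f} \bmod N. \]

Using the extended Euclidean algorithm, $\mathcal{B}$ computes integers $A, B$ such that $Af + B \cdot 2^{3k-t} = 1$. Then

\[ \bigl((\gamma^*/\gamma')^{A} \cdot y^{B}\bigr)^{2^{3k-t}} = y^{Af} \cdot y^{B \cdot 2^{3k-t}} = y, \]

and so $\mathcal{B}$ can compute a square root of $y$. Since $\mathcal{B}$ computes a square root whenever $\mathcal{A}$ succeeds, the success probability of $\mathcal{A}$ must be negligible.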
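
The last computation can be traced on a toy modulus. The sketch below assumes a value `x` (playing the role of $\gamma^*/\gamma'$) with $x^{2^{3k-t}} = y^{f}$, where $d = f \cdot 2^t$ plays the role of $\mu - \mu'$; the modulus and the exponent `E = 12` standing for $3k$ are illustrative only, and negative modular exponents require Python 3.8 or later.

```python
P, Q = 1019, 1031       # toy Blum primes (illustrative values only)
N = P * Q
E = 12                  # plays the role of 3k


def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t


def sqrt_from_relation(x, y, d):
    """Given x**(2**(E-t)) == y**f (mod N), where d = f * 2**t with f odd and
    0 < d < 2**E, return a square root of y modulo N."""
    t = (d & -d).bit_length() - 1        # largest power of two dividing d
    f = d >> t
    m = 2 ** (E - t)
    g, a, b = ext_gcd(f, m)              # a*f + b*m == 1 because f is odd
    z = pow(x, a, N) * pow(y, b, N) % N  # z**m == y**(a*f + b*m) == y
    return pow(z, m // 2, N)
```

For example, taking any unit `s`, setting `y = s**(2**E) mod N` and `x = s**d mod N` satisfies the precondition, and squaring the returned value gives back `y`.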


Acknowledgments. We are grateful to Gene Tsudik for his insightful com- 
ments and contributions during the early stages of this work. 
Abstract. Adaptive oblivious transfer (OT) is a two-party protocol which simulates an ideal world such that the sender sends $M_1, \ldots, M_n$ to the trusted third party (TTP), and the receiver receives $M_{\sigma_i}$ from TTP adaptively for $i = 1, 2, \ldots, k$. This paper shows the first pairing-free fully simulatable adaptive OT. It is also the first fully simulatable scheme which does not rely on dynamic assumptions. Indeed our scheme holds under the DDH assumption.

Keywords: Adaptive OT, Fully Simulatable, DDH, Standard Model. 


1 Introduction 

In a non-adaptive ( k,n ) oblivious transfer (OT) scheme which is denoted by 
ot;: jfilU14j . a sender has n secret strings Mi, • • • , M n , and a receiver has k 
secret choice indices $\sigma_1, \ldots, \sigma_k \in \{1, \ldots, n\}$. At the end of the protocol, the receiver learns $M_{\sigma_1}, \ldots, M_{\sigma_k}$ (only), and the sender learns nothing on $\sigma_1, \ldots, \sigma_k$.
Efficient OT schemes are important because OTf is a key building block for 
secure multi-party computation |2UI7ll2j . 

In an adaptive $(k, n)$ oblivious transfer protocol, which is denoted by $OT^n_{k \times 1}$,
the receiver chooses cr,; adaptively depending on M ai , ■ ■ ■ , M ai _ -i G3 In other 
words, $OT^n_{k \times 1}$ is a two-party protocol $(S, R)$ which simulates an ideal world protocol $(S', R')$ such that

1. the sender $S'$ sends $M_1, \ldots, M_n$ to the trusted third party (TTP), and

2. the receiver $R'$ receives $M_{\sigma_i}$ from TTP adaptively for $i = 1, 2, \ldots, k$, where the receiver chooses $\sigma_i$ based on $M_{\sigma_1}, \ldots, M_{\sigma_{i-1}}$.

Adaptive OT also has wide applications, such as oblivious database searches and secure multiparty computation.

As a security notion of OT (for both non-adaptive and adaptive), half simu- 
latability was considered until recently j 151 1 fill III 8| . This definition requires 

- (Sender’s privacy.) For any receiver $R$ in the real world, there exists a receiver $R'$ in the ideal world such that the outputs of $R$ and $R'$ are indistinguishable.

— (Receiver’s privacy.) For any input to the receiver, the view of the sender 
must be indistinguishable. (Note that the honest sender outputs nothing.) 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 334-346, 2009.
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However, Naor and Pinkas noticed that there can be a practical attack on a half 
simulatable adaptive OT ESI- 

To solve this problem, Camenisch, Neven and shelat formalized a notion of full simulatability [2]. In this definition, we consider a pair of outputs of the sender and the receiver. Although the honest sender outputs nothing, a malicious sender may output its view in the execution of the protocol. Full simulatability now requires that

- (Sender’s privacy) For any receiver $R$ in the real world, there exists a receiver $R'$ in the ideal world such that $(S'_{out}, R'_{out})$ is indistinguishable from $(S_{out}, R_{out})$, where $A_{out}$ denotes the output of $A$.

- (Receiver’s privacy) For any sender $S$ in the real world, there exists a sender $S'$ in the ideal world such that $(S'_{out}, R'_{out})$ is indistinguishable from $(S_{out}, R_{out})$.

They then showed a fully simulatable adaptive OT in the random oracle model, and one in the standard model, respectively [2].

We focus on the standard model in this paper.¹ All fully simulatable adaptive OT schemes known so far have been constructed based on pairings, and they rely on dynamic assumptions such as the $q$-strong DH assumption. For example, Camenisch et al.’s $OT^n_{k \times 1}$ relies on the $q$-strong DH assumption and the $q$-PDDH assumption. Green and Hohenberger’s $OT^n_{k \times 1}$ relies on the $q$-hidden LRSW assumption [?]. (This scheme achieves UC security.) Jarecki and Liu’s $OT^n_{k \times 1}$ relies on the decisional $q$-DHI assumption [?].

This paper shows the first pairing-free fully simulatable adaptive OT. It is also the first fully simulatable scheme which does not rely on dynamic assumptions. Indeed our scheme holds under the DDH assumption. While the previous schemes use a signature scheme as a building block,² our scheme utilizes the ElGamal encryption scheme. (Hence we do not need a pairing.)

Our scheme is conceptually very simple and efficient. The initialization phase 
and each transfer phase are constant round protocols. Thus the total round 
complexity is proportional to k. 

Finally we extend our scheme to a fully simulatable non-adaptive OT which 
requires constant rounds. Green and Hohenberger showed a fully simulatable 
non-adaptive OT(j based on pairing under the decisional BDH assumption 0 . 
On the other hand, our OT£ is pairing-free and relies on the DDH assumption. 

Lindell showed a fully simulatable OTf under DDH, Paillier’s decisional iVth 
residuosity, and quadratic residuosity assumptions as well as under the assump- 
tion that homomorphic encryption exists m- (He claimed that they can be 
extended to OTff.) Under the DDH assumption, our OTf is more efficient than 
the Lindell’s scheme ra 

1 In the random oracle model, Ogata and Kurosawa showed an adaptive OT based on 
Chaum’s blind signature scheme [liSj . Camenisch, Neven and shelat [2| proved that 
it is fully simulatable as well as they corrected a flaw of • Green and Hohenberger 
showed a scheme under the decisional BDH assumption |Hj . 

2 Maybe because an adaptive OT shown by Ogata and Kurosawa ^S| utilizes Chaum’s 
blind signature scheme. 


336 K. Kurosawa and R. Nojii 


Table 1. Fully simulatable Adaptive OT without RO

scheme                       pairing   dynamic assumption   assumption
Camenisch et al. [2]         yes       yes                  q-strong DH and q-PDDH
Green and Hohenberger [?]    yes       yes                  q-hidden LRSW (UC secure)
Jarecki and Liu [?]          yes       yes                  q-DHI
Proposed                     no        no                   DDH


2 Preliminaries 

2.1 Notations 

In this paper, we denote a security parameter by $r \in \mathbb{N}$. All the algorithms take
r as the first input and run in (expected) polynomial-time in r. We denote prob- 
abilistic polynomial-time by PPT for short. We often do not write the security 
parameter explicitly. 

2.2 Proof Systems 

To design our scheme, we use several proof systems. We follow the definitions 
described in |4l5l2ij . 

Let $R = \{(\alpha, \beta)\} \subseteq \{0,1\}^* \times \{0,1\}^*$ be a binary relation such that $|\beta| \le \mathrm{poly}(|\alpha|)$ for all $(\alpha, \beta) \in R$, where poly is some polynomial. We only consider relations $R$ such that $(\alpha, \beta) \in R$ can be decided in time polynomial in $|\alpha|$ for all $(\alpha, \beta)$. We define $L_R = \{\alpha \mid \exists \beta \text{ such that } (\alpha, \beta) \in R\}$.

Proof of Membership (PoM): A pair of interacting algorithms $(P, V)$, called a prover and a verifier, is a proof of membership (PoM) for a relation $R$ if completeness and soundness are satisfied. Here, we say that $(P, V)$ satisfies completeness if for all $(\alpha, \beta) \in R$, the probability of $V(\alpha)$ accepting a conversation with $P(\alpha, \beta)$ is 1. Also we say that $(P, V)$ satisfies soundness if for all $\alpha \notin L_R$ and all $P^*(\alpha)$ (including cheating provers), the probability of $V(\alpha)$ accepting the conversation with $P^*$ is negligible in $|\alpha|$. We call this probability the soundness error of the proof system.

Proof of Knowledge (PoK): We say a pair of interacting algorithms $(P, V)$ is a PoK for a relation $R$ with knowledge error $\kappa \in [0, 1]$ if it satisfies completeness as described above and has an expected polynomial-time algorithm $E$, called a knowledge extractor. Here, the algorithm $E$ is a knowledge extractor for a relation $R$ if, whenever a possibly cheating $P$ has probability $\epsilon$ of convincing $V$ to accept $\alpha$, then $E$, when given black-box access to $P$, outputs a witness $\beta$ for $\alpha$ with probability $\epsilon - \kappa$.

Witness Indistinguishability (WI): A proof system $(P, V)$ is perfect WI if for every $(\alpha, \beta_1), (\alpha, \beta_2) \in R$, and any ppt cheating verifier, the output of $V(\alpha)$ (including a cheating verifier) after interacting with $P(\beta_1)$ and that of $V(\alpha)$ after interacting with $P(\beta_2)$ are identically distributed.

Zero Knowledge (ZK): We say that a proof system $(P, V)$ is perfect ZK if there exists an expected polynomial-time algorithm Sim, called a simulator, such that for any PPT cheating verifier $V$ and any $(\alpha, \beta) \in R$, the output of $V(\alpha)$ after interacting with $P(\beta)$ and that of $\mathsf{Sim}^{V(\alpha)}(\alpha)$ are identically distributed.

3 k-Out-of-n Oblivious Transfer

In this section, we present a UC-like definition of fully simulatable non- adaptive 
OT. Similarly, we present a UC-like definition of fully simulatable adaptive OT. 
We consider a weak model of UC framework as follows. 

— At the beginning of the game, an adversary A can corrupt either a sender S 
or a receiver R, but not both. 

— A can send a message (which will be denoted by A out ) to an environment 
Z after the end of the protocol. (A cannot communicate with Z during the 
protocol execution.) 

The ideal functionalities of $OT^n_k$ and $OT^n_{k \times 1}$ will be shown below. For a protocol $\pi = (S, R)$, define $\mathrm{Adv}(Z)$ as

\[ \mathrm{Adv}(Z) = \bigl| \Pr(Z = 1 \text{ in the real world}) - \Pr(Z = 1 \text{ in the ideal world}) \bigr| \]

3.1 Non-adaptive k-Out-of-n Oblivious Transfer

In the ideal world of $OT^n_k$, the ideal functionality $\mathcal{F}_{non}$, an ideal world adversary $A'$ and an environment $Z$ behave as follows.

(Initialization phase:)

1. An environment $Z$ sends $(M_1, \ldots, M_n)$ to the dummy sender $S'$.

2. $S'$ sends $(M_1^*, \ldots, M_n^*)$ to $\mathcal{F}_{non}$, where $(M_1^*, \ldots, M_n^*) = (M_1, \ldots, M_n)$ if $S'$ is not corrupted.

(Transfer phase:)

1. $Z$ sends $(\sigma_1, \ldots, \sigma_k)$ to the dummy receiver $R'$, where $1 \le \sigma_i \le n$.

2. $R'$ sends $(\sigma_1^*, \ldots, \sigma_k^*)$ to $\mathcal{F}_{non}$, where $(\sigma_1^*, \ldots, \sigma_k^*) = (\sigma_1, \ldots, \sigma_k)$ if $R'$ is not corrupted.

3. $\mathcal{F}_{non}$ sends received to an ideal process adversary $A'$.

4. $A'$ sends $b = 1$ or $0$ to $\mathcal{F}_{non}$, where $b = 1$ if $S'$ is not corrupted.

5. $\mathcal{F}_{non}$ sends $Y$ to $R'$, where



6. R' sends Y to Z. 


After the end of the protocol, A' sends a message A' out to Z. Finally Z outputs 
1 or 0. 
In the real world, a protocol $(S, R)$ is executed without $\mathcal{F}_{non}$, where the environment $Z$ and a real world adversary $A$ behave in the same way as above.

Definition 1. We say that $(S, R)$ is secure against the sender (receiver) corruption if for any real world adversary $A$ who corrupts the sender $S$ (the receiver $R$), there exists an ideal world adversary $A'$ who corrupts the dummy sender $S'$ (the dummy receiver $R'$) such that for any environment $Z$, $\mathrm{Adv}(Z)$ is negligible.

Definition 2. We say that $(S, R)$ is a fully simulatable $OT^n_k$ if it is secure against the sender corruption and the receiver corruption.


3.2 Adaptive k-Out-of-n Oblivious Transfer

In the ideal world of $OT^n_{k \times 1}$, the ideal functionality $\mathcal{F}_{adapt}$, an ideal world adversary $A'$ and an environment $Z$ behave as follows.

(Initialization phase:)

1. An environment $Z$ sends $(M_1, \ldots, M_n)$ to the dummy sender $S'$.

2. $S'$ sends $(M_1^*, \ldots, M_n^*)$ to $\mathcal{F}_{adapt}$, where $(M_1^*, \ldots, M_n^*) = (M_1, \ldots, M_n)$ if $S'$ is not corrupted.

(Transfer phase:) For $i = 1, \ldots, k$,


1. $Z$ sends $\sigma_i$ to the dummy receiver $R'$, where $1 \le \sigma_i \le n$.

2. $R'$ sends $\sigma_i^*$ to $\mathcal{F}_{adapt}$, where $\sigma_i^* = \sigma_i$ if $R'$ is not corrupted.

3. $\mathcal{F}_{adapt}$ sends received to an ideal process adversary $A'$.

4. $A'$ sends $b = 1$ or $0$ to $\mathcal{F}_{adapt}$, where $b = 1$ if $S'$ is not corrupted.

5. $\mathcal{F}_{adapt}$ sends $Y_i$ to $R'$, where


Yi = 


if b = 1 
if b= 0 


6. $R'$ sends $Y_i$ to $Z$.


After the end of the protocol, $A'$ sends a message $A'_{out}$ to $Z$. Finally $Z$ outputs 1 or 0.

In the real world, a protocol $(S, R)$ is executed without $\mathcal{F}_{adapt}$, where the environment $Z$ and a real world adversary $A$ behave in the same way as above.

Definition 3. We say that $(S, R)$ is secure against the sender (receiver) corruption if for any real world adversary $A$ who corrupts the sender $S$ (the receiver $R$), there exists an ideal world adversary $A'$ who corrupts the dummy sender $S'$ (the dummy receiver $R'$) such that for any environment $Z$, $\mathrm{Adv}(Z)$ is negligible.

Definition 4. We say that $(S, R)$ is a fully simulatable $OT^n_{k \times 1}$ if it is secure against the sender corruption and the receiver corruption.
3.3 Remarks

Our definition of fully simulatable adaptive OT is weaker than UC security because our adversaries $A$ cannot communicate with $Z$ during the protocol execution. On the other hand, it is stronger than that of [2], which is not UC-like. In our definition, $Z$ chooses $\sigma_i$. Hence $\sigma_i$ can depend on all of $(M_1, \ldots, M_n)$. In the definition of [2], the receiver chooses $\sigma_i$. Hence $\sigma_i$ can depend on $(M_{\sigma_1}, \ldots, M_{\sigma_{i-1}})$ only.


4 Our Fully Simulatable Adaptive OT 

In this section, we show an adaptive $OT^n_{k \times 1}$ based on the ElGamal encryption scheme, and prove its full simulatability under the DDH assumption.

Let $G$ be a multiplicative group of prime order $q$. Then the DDH assumption states that, for every PPT distinguisher $D$,

\[ \epsilon_{DDH}(D) = \bigl| \Pr(D(g, g^{\alpha}, g^{\beta}, g^{\alpha\beta}) = 1) - \Pr(D(g, g^{\alpha}, g^{\beta}, g^{\gamma}) = 1) \bigr| \]

is negligible, where the probability is taken over the random bits of $D$, the random choice of the generator $g$, and the random choice of $\alpha, \beta, \gamma \in \mathbb{Z}_q$. We denote

\[ \epsilon_{DDH} = \max\{\epsilon_{DDH}(D)\}, \]

where the maximum is taken over all PPT distinguishers $D$.

The initialization phase and each transfer phase are constant round protocols. 
Hence the total round complexity is proportional to k. 

Initialization Phase 

1. The sender chooses $G$, $g$ and $(x_1, \ldots, x_n, r) \in (\mathbb{Z}_q)^{n+1}$ randomly, and computes $h = g^r$.

2. For $i = 1, \ldots, n$, the sender computes

\[ C_i = (A_i, B_i) = (g^{x_i}, M_i \cdot h^{x_i}), \]

where $M_1, \ldots, M_n \in G$.

3. The sender sends $(G, h, C_1, \ldots, C_n)$.

4. The sender proves by ZK-PoK that he knows $r$.

The protocol stops if the receiver rejects.

The jth Transfer Phase 

1. The receiver chooses a choice index $1 \le \sigma_j \le n$ based on $M_{\sigma_1}, \ldots, M_{\sigma_{j-1}}$.

2. The receiver chooses $u \in Z_q$ randomly and computes $U = (A_{\sigma_j})^u$.

He then sends U.

3. The receiver proves in WI-PoK that he knows u such that

$$U = A_1^u \lor \cdots \lor U = A_n^u.$$

The protocol stops if the sender rejects.

4. The sender computes $V = U^r$ and sends V.

5. The sender proves in ZK-PoM that (g, h, U, V) is a DDH-tuple.

The protocol stops if the receiver rejects.

6. The receiver obtains $M_{\sigma_j}$ by computing $B_{\sigma_j} / V^{1/u}$.
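
The message flow of the initialization and transfer phases can be sketched in Python as follows. This is only an illustration under stated assumptions: the group is a toy subgroup of prime order q = 1019 in Z_p^* with p = 2039 and g = 4 (far too small for real use), indices are 0-based, the exponents r and u are drawn nonzero so that 1/u mod q exists, and the ZK-PoK, WI-PoK and ZK-PoM steps are omitted entirely. All function names are hypothetical and do not come from the paper.

```python
import secrets

# Toy parameters (assumption): P = 2*Q + 1 with Q prime; G = 4 has order Q mod P.
P, Q, G = 2039, 1019, 4

def sender_init(messages):
    """Initialization, steps 1-3: return secret r and public (h, [C_1..C_n])."""
    r = 1 + secrets.randbelow(Q - 1)            # nonzero exponent
    h = pow(G, r, P)
    cts = []
    for m in messages:                          # every m must lie in the subgroup
        x = secrets.randbelow(Q)
        cts.append((pow(G, x, P), m * pow(h, x, P) % P))  # (A_i, B_i)
    return r, h, cts

def receiver_query(cts, sigma):
    """Transfer, step 2: blind A_sigma with a random nonzero exponent u."""
    u = 1 + secrets.randbelow(Q - 1)
    return u, pow(cts[sigma][0], u, P)          # U = A_sigma^u

def sender_answer(r, U):
    """Transfer, step 4: V = U^r."""
    return pow(U, r, P)

def receiver_recover(cts, sigma, u, V):
    """Transfer, step 6: M_sigma = B_sigma / V^(1/u)."""
    mask = pow(V, pow(u, -1, Q), P)             # V^(1/u) = h^(x_sigma)
    return cts[sigma][1] * pow(mask, -1, P) % P

if __name__ == "__main__":
    msgs = [pow(G, e, P) for e in (5, 17, 123)]  # messages encoded as group elements
    r, h, cts = sender_init(msgs)
    u, U = receiver_query(cts, 1)
    print(receiver_recover(cts, 1, u, sender_answer(r, U)) == msgs[1])
```

Correctness follows from $V^{1/u} = g^{r x_{\sigma_j}} = h^{x_{\sigma_j}}$, which is exactly the mask on $B_{\sigma_j}$. The modular inverses use the three-argument `pow` with exponent -1, which requires Python 3.8 or later.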

Three ZK or WI proof systems in the scheme are constructed efficiently as 
follows. 

— An efficient 4-round ZK-PoK exists which can be used in the initialization
phase. It is obtained by applying the technique of [?] to Schnorr's
identification scheme [?].

— An efficient 3-round WI-PoK exists which can be used in the transfer phase.
It is implemented by applying the or-composition technique [?] to [?].

— An efficient 4-round ZK-PoM exists which can be used in the transfer phase.
It comes from the confirmation protocol of Chaum's undeniable signature
scheme [3] (which is a ZK-PoM for the DDH-tuple).

Theorem 1. The above protocol is a fully simulatable adaptive $OT^n_{k\times 1}$ under
the DDH assumption.

The proof is given in Section 6.

5 Extension to Fully Simulatable Non-adaptive OT 

In this section, we extend our adaptive OT to a fully simulatable non-adaptive 
OT which requires constant rounds. 

5.1 How to Prove Many DDH-Tuples

We show a 4-round ZK-PoM which proves that $(g, h, U_1, V_1), \ldots, (g, h, U_k, V_k)$
are all DDH-tuples.

1. The receiver sends random $(a_1, \ldots, a_k)$.

2. The sender proves that $(g, h, \prod_{i=1}^{k} U_i^{a_i}, \prod_{i=1}^{k} V_i^{a_i})$ is a DDH-tuple by using
the confirmation protocol of [3].

The confirmation protocol of [3] is a 4-round ZK-PoM on a DDH-tuple. Hence
the above protocol runs in 4 rounds. (Step 1 and the 1st round of the confirmation
protocol are merged.)

Lemma 1. Suppose that some $(g, h, U_i, V_i)$ is not a DDH-tuple. Then
$(g, h, \prod_{i=1}^{k} U_i^{a_i}, \prod_{i=1}^{k} V_i^{a_i})$ is a DDH-tuple with negligible probability.

Proof. Assume that $U_i = g^{x_i}$ and $V_i = h^{y_i}$ for $i = 1, \ldots, k$. Then

$$\prod_{i=1}^{k} U_i^{a_i} = g^{\sum_{i=1}^{k} a_i x_i}, \qquad \prod_{i=1}^{k} V_i^{a_i} = h^{\sum_{i=1}^{k} a_i y_i}.$$

Suppose that $(g, h, U_1, V_1)$ is not a DDH-tuple. That is, $x_1 \ne y_1$. Then for any
values of $a_2, \ldots, a_k$, there exists a unique $a_1$ such that

$$\sum_{i=1}^{k} a_i (x_i - y_i) = 0 \bmod q. \qquad (1)$$

Hence the number of $(a_1, \ldots, a_k)$ which satisfy eq. (1) is equal to $q^{k-1}$.
Therefore

$$\Pr(\text{eq. (1) holds}) = q^{k-1}/q^k = 1/q.$$

This means that $(g, h, \prod_{i=1}^{k} U_i^{a_i}, \prod_{i=1}^{k} V_i^{a_i})$ is a DDH-tuple with negligible
probability. □
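
The counting step of this proof can be checked by brute force for a small prime q. The sketch below works directly with the exponents $x_i$ and $y_i$ (an assumption made for illustration only; in the protocol they are hidden inside $U_i$ and $V_i$) and counts the challenge vectors for which the combined tuple is a DDH-tuple. The function name is hypothetical.

```python
from itertools import product

def count_bad_challenges(q, x, y):
    """Count (a_1..a_k) in Z_q^k with sum_i a_i * (x_i - y_i) = 0 mod q."""
    k = len(x)
    return sum(
        1
        for a in product(range(q), repeat=k)
        if sum(ai * (xi - yi) for ai, xi, yi in zip(a, x, y)) % q == 0
    )

if __name__ == "__main__":
    q = 11
    # First tuple is not a DDH-tuple (x_1 != y_1): expect q^(k-1) = 121 of 1331.
    print(count_bad_challenges(q, (3, 7, 2), (5, 1, 2)))
```

With q = 11 and k = 3, a cheating sender passes for 121 of the 1331 possible challenges, that is, with probability 1/q as stated in the lemma.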

Theorem 2. The above protocol is a ZK-PoM on many DDH-tuples. 

Proof. The completeness is clear. The zero-knowledgeness follows from that of
the confirmation protocol of [3]. The soundness follows from Lemma 1 and that
of the confirmation protocol of [3]. □

5.2 Constant Round $OT^n_k$

In this section, we modify our $OT^n_{k\times 1}$ to obtain a constant round $OT^n_k$ as follows.

— At step 4 of the initialization phase, the sender sends $(G, h, A_1, \ldots, A_n)$.

— At the end of the transfer phase, the sender sends $(B_1, \ldots, B_n)$.

— In the transfer phase, run step 3 in parallel (it is still a WI protocol).

At step 5, the sender proves that $(g, h, U_1, V_1), \ldots, (g, h, U_k, V_k)$ are all
DDH-tuples by using the ZK-PoM of Sec. 5.1.

Theorem 3. The proposed $OT^n_k$ is a constant round fully simulatable $OT^n_k$ under
the DDH assumption.

The proof is similar to that of Theorem 1.

6 Proof of Theorem 1

We first prove that the proposed scheme is secure against sender corruption. We 
next prove that it is secure against receiver corruption. 

6.1 Security against Sender Corruption 

Lemma 2. The proposed scheme is secure against sender corruption. 

Proof. For every real-world adversary A who corrupts the sender, we construct 
an ideal-world adversary A' such that Adv(Z) is negligible. 

We will consider a sequence of games $Game_0, Game_1, \ldots, Game_4$, where $Game_0$ is
the real world experiment of Sec. 3 and $Game_4$ is the ideal world experiment,
respectively. Let

$$\Pr(Game_i) = \Pr(Z = 1 \text{ in } Game_i).$$

$Game_0$: This is the real world experiment such that the sender is controlled by
an adversary A. Hence

$$\Pr(Game_0) = \Pr(Z = 1 \text{ in the real world}).$$

$Game_1$: This is the same as the previous game except for the following. In the
initialization phase, if the receiver accepts the ZK-PoK, then he extracts r from
A by running the knowledge extractor $E_1$ which is allowed to rewind A. This
game outputs ⊥ if the extractor $E_1$ fails in extracting r. Unless this happens,
these two games are identical. Therefore,

$$|\Pr(Game_0) - \Pr(Game_1)| \le \kappa_1,$$

where $\kappa_1$ is the knowledge error of the extractor.

$Game_2$: This is the same as the previous game except for the following. In each
transfer phase, if the receiver accepts the ZK-PoM which proves that (g, h, U, V)
is a DDH-tuple, then he obtains $M_{\sigma_j}$ by computing $B_{\sigma_j} / A_{\sigma_j}^r$. These two games
are identical unless the above $M_{\sigma_j}$ is different from $B_{\sigma_j} / V^{1/u}$. This happens if
the receiver accepts the ZK-PoM even though (g, h, U, V) is not a DDH-tuple.
Hence

$$|\Pr(Game_1) - \Pr(Game_2)| \le k\kappa_3,$$

where $\kappa_3$ is the soundness error probability of the ZK-PoM.


$Game_3$: This is the same as the previous game except for the following. In each
transfer phase, the receiver computes U as $U = A_1^u$. (The receiver can still obtain
$M_{\sigma_j}$ as can be seen from $Game_2$.) Since our WI-PoK is perfect,

$$\Pr(Game_2) = \Pr(Game_3).$$

$Game_4$: This game is the ideal world experiment in which an ideal-world adversary
A' plays the role of the receiver of $Game_3$ and uses A as a blackbox. A' can do
this because the receiver does not use $\sigma_1, \ldots, \sigma_k$ in $Game_3$.

Finally A' outputs what A outputs. It is easy to see that $Game_3$ and $Game_4$ are
identical from the viewpoint of Z. Hence

$$\Pr(Game_3) = \Pr(Game_4).$$

Further

$$\Pr(Game_4) = \Pr(Z = 1 \text{ in the ideal world}).$$

Now, we can summarize this lemma as follows:

$$Adv(Z) = |\Pr(Game_4) - \Pr(Game_0)| \le \sum_{i=0}^{3} |\Pr(Game_{i+1}) - \Pr(Game_i)| \le \kappa_1 + k\kappa_3. \qquad \Box$$




6.2 Security against Receiver Corruption 

Lemma 3. The proposed scheme is secure against receiver corruption under the 
DDH assumption. 

Proof. For every real-world adversary A who corrupts the receiver, we construct
an ideal-world adversary A' such that Adv(Z) is negligible.

We will consider a sequence of games $Game_0, Game_1, \ldots, Game_5$, where $Game_0$
is the real world experiment of Sec. 3 and $Game_5$ is the ideal world experiment.

$Game_0$: This is the real world experiment such that the receiver is controlled by
an adversary A. Hence

$$\Pr(Game_0) = \Pr(Z = 1 \text{ in the real world}).$$

$Game_1$: This is the same as the previous game except for the following. In each
transfer phase, instead of running the ZK-PoM which proves that (g, h, U, V)
is a DDH-tuple, the sender runs the zero-knowledge simulator of the ZK-PoM
which is allowed to rewind A. Since the ZK-PoM is perfect ZK, we have

$$\Pr(Game_1) = \Pr(Game_0).$$

$Game_2$: This is the same as the previous game except for the following. In each
transfer phase, if the sender accepts the WI-PoK, then she extracts u from A
by running the knowledge extractor $E_2$ which is allowed to rewind A. This game
outputs ⊥ if the extractor $E_2$ fails in extracting u. Unless this happens, these
two games are identical. Therefore,

$$|\Pr(Game_2) - \Pr(Game_1)| \le k\kappa_2,$$

where $\kappa_2$ is the knowledge error of the extractor.

$Game_3$: This is the same as the previous game except that the sender computes
V as $V = (B_{\sigma_j} / M_{\sigma_j})^u$ instead of $V = U^r$. It is clear that there is no essential
difference between the two games. Therefore,

$$\Pr(Game_3) = \Pr(Game_2).$$

$Game_4$: This is the same as the previous game except that the sender uses
a random $M'_i$ to compute each $C_i$ in the initialization phase. The difference
$|\Pr(Game_4) - \Pr(Game_3)|$ is still negligible by the semantic security of the
ElGamal cryptosystem, which is implied by the DDH assumption.

Claim. If the DDH problem is hard then $|\Pr(Game_4) - \Pr(Game_3)|$ is negligible.
More concretely,

$$|\Pr(Game_4) - \Pr(Game_3)| \le \epsilon_{DDH}. \qquad (2)$$

The proof of this claim is given later.
$Game_5$: This game is the ideal world experiment in which an ideal-world adversary
A' plays the role of the sender of $Game_4$, and uses A as a blackbox. A' can do this
because the sender does not use $M_1, \ldots, M_n$ in $Game_4$.

Finally A' outputs what A outputs. It is easy to see that $Game_4$ and $Game_5$ are
identical from the viewpoint of Z. Hence

$$\Pr(Game_4) = \Pr(Game_5).$$

Further

$$\Pr(Game_5) = \Pr(Z = 1 \text{ in the ideal world}).$$

Now, we can summarize this lemma as follows:

$$Adv(Z) = |\Pr(Game_5) - \Pr(Game_0)| \le \sum_{i=0}^{4} |\Pr(Game_{i+1}) - \Pr(Game_i)| \le k\kappa_2 + \epsilon_{DDH}. \qquad \Box$$

To complete the proof, we must provide the proof of the claim. To do so, we
need the following lemma³, which can be thought of as an "extended" version of
the DDH assumption.

Lemma 4 (Lemma 4.2 in [?]). If there exists a probabilistic algorithm D
with running time t such that

$$\Pr(D(g, g^r, g^{x_1}, \ldots, g^{x_n}, g^{rx_1}, \ldots, g^{rx_n}) = 1) - \Pr(D(g, g^r, g^{x_1}, \ldots, g^{x_n}, g^{z_1}, \ldots, g^{z_n}) = 1) > \epsilon,$$

where the probability is taken over the random bits of D, the random choice of
the generator g in G, and the random choice of $x_1, \ldots, x_n, r, z_1, \ldots, z_n \in Z_q$,
then there exists a probabilistic algorithm with running time $n \cdot poly(r) + t$ that
breaks the DDH assumption with probability $\ge \epsilon$, for some polynomial poly.

We now show a proof of the claim. 

Proof (of the claim). Let $Game'_3$ ($Game'_4$) be the same as $Game_3$ ($Game_4$) except
for the following. In the initialization phase, instead of running the ZK-PoK in
which the sender proves that he knows r, the sender runs the zero-knowledge
simulator of the ZK-PoK which is allowed to rewind A. Since the ZK-PoK is
perfect ZK, it holds that

$$\Pr(Game'_3) = \Pr(Game_3), \qquad \Pr(Game'_4) = \Pr(Game_4).$$

3 Naor and Reingold proved it by using the random reducibility of the DDH-tuple. 
We now construct a DDH distinguisher D in the sense of Lemma 4. The input to
D is $(g, h, g^{x_1}, \ldots, g^{x_n}, y_1, \ldots, y_n)$, where $y_i = g^{rx_i}$ or $g^{z_i}$. Our D simulates Z,
A and the sender of $Game'_3$ or $Game'_4$ faithfully, except that in the initialization
phase, D simulates the sender by using $(g, h, g^{x_1}, \ldots, g^{x_n})$ and $h^{x_i} = y_i$ for each
i. Finally D outputs 1 iff Z outputs 1.

It is easy to see that D simulates $Game'_3$ if $y_i = g^{rx_i}$ for each i, and $Game'_4$
otherwise. Therefore

$$|\Pr(Game'_4) - \Pr(Game'_3)| \le \epsilon_{DDH}. \qquad (3)$$

Hence eq. (2) holds. □

7 Fully Simulatable $OT^2_1$

We have constructed a fully simulatable adaptive OT under the DDH assumption
in the standard model. It is clear that we can obtain a fully simulatable
(1,2)-OT ($OT^2_1$) as a special case.

On the other hand, Lindell showed a fully simulatable $OT^2_1$ under the DDH,
Paillier's decisional Nth residuosity, and quadratic residuosity assumptions, as well
as under the assumption that homomorphic encryption exists, in the standard
model [13].

Let us compare our scheme with Lindell's $OT^2_1$ which is based on the DDH
assumption. His scheme builds on the $OT^2_1$ of [?] and uses a cut-and-choose
technique. The computational cost and the communication cost are $O(\ell)$ times
larger than those of our first scheme to achieve

$$Adv(Z) \le 2^{-\ell+2}.$$

Hence our scheme is more efficient.
Abstract. An r-collision for a function is a set of r distinct inputs with
identical outputs. Actually finding r-collisions for a random map over a
finite set of cardinality N requires at least about $N^{(r-1)/r}$ units of time
on a sequential machine. For r = 2, memoryless and well-parallelizable
algorithms are known. The current paper describes memory-efficient and
parallelizable algorithms for $r \ge 3$. The main results are: (1) A sequential
algorithm for 3-collisions, roughly using memory $N^{\alpha}$ and time $N^{1-\alpha}$
for $\alpha \le 1/3$. In particular, given $N^{1/3}$ units of storage, one can find
3-collisions in time $N^{2/3}$. (2) A parallelization of this algorithm using
$N^{1/3}$ processors running in time $N^{1/3}$, where each single processor only
needs a constant amount of memory. (3) A generalisation of this second
approach to r-collisions for $r \ge 3$: given $N^s$ parallel processors, with
$s \le (r-2)/r$, one can generate r-collisions roughly in time $N^{((r-1)/r)-s}$,
using memory $N^{((r-2)/r)-s}$ on every processor.

Keywords: multicollision, random map, memory-efficient, parallel
implementation, cryptanalysis.


1 Introduction 

The problem of finding collisions and multicollisions in random mappings is
of significant interest for cryptography, and mainly for cryptanalysis. It is well
known that finding an r-collision for a random map over a finite set of cardinality
N requires¹ more than $N^{(r-1)/r}$ map evaluations.

Multicollisions for hash functions. If the map under consideration is a hash function,
or has been derived from a hash function, many researchers consider faster
multicollisions as a certificational hash function weakness. Accordingly, it was
worrying for the research community to learn that multicollisions could be found
much faster for a widely used class of hash functions: iterated hash functions [?].
For n-bit hash functions from this class, one can generate $2^k$-collisions in time

1 An r-collision is a set of r different inputs $x_1, \ldots, x_r$ which all generate the same
output $map(x_1) = \cdots = map(x_r)$. For an r-collision, one needs to evaluate the map
$(r!)^{1/r} \cdot N^{(r-1)/r}$ times [22]. For small r, we can approximate this by $O(N^{(r-1)/r})$.

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 347-363, 2009.
$k \cdot 2^{n/2}$, rather than the expected time $2^{n(2^k-1)/2^k}$. The basic observation is
straightforward: given a sequence of k consecutive 2-collisions, it is possible with
iterated hash functions to consider the $2^k$ different messages obtained by taking
all possible choices of message block for each collision and obtain $2^k$ times the
same output. These iterated multicollisions have been generalized later to more
complex types of iterated hash functions; for example, see [?]. It was also
remarked that these iterated multicollisions were a rediscovery and generalization
of an older attack of Coppersmith [?].

In particular, this type of multicollisions allowed a surprising attack on hash
cascades, i.e., hash functions H which are the concatenation of two hash functions
$G_1$ and $G_2$, i.e., $H(X) := (G_1(X), G_2(X))$. If, say, $G_1$ is an iterated
hash function and vulnerable to the multicollision attack, and $G_2$ is any n-bit
hash function, the adversary just needs to generate a $2^{n/2}$-multicollision for $G_1$.
Thanks to the birthday paradox, among the $2^{n/2}$ messages colliding for $G_1$, one
expects to find a pair of messages colliding for $G_2$ with constant probability. As
a consequence, a collision for the 2n-bit hash function H can be obtained with
much less than $2^n$ hash evaluations.

Multicollisions for random maps. In contrast to [?], we consider generic attacks
and, accordingly, we model our functions as random maps. In that case,
$N^{(r-1)/r}$ is a lower bound on the sequential time required for
finding an r-collision, and time-optimal algorithms are well known. Furthermore,
it is well known how to find ordinary collisions (aka 2-collisions) with negligible
memory (using the Floyd, Brent or Nivasch [?] cycle finding algorithms),
and also how to parallelize these algorithms using distinguished point methods
[18,19,20,23,24,25,26,27].

In general, the issue of memory-efficient and parallelizable r-collision algorithms
appears to be an unsolved question. Authors usually assume $N^{(r-1)/r}$
units of memory (i.e., the maximum any algorithm can use in the given amount
of time) and neglect parallelization entirely. For recent examples of the application
of multicollisions to cryptography, see, e.g., the cryptanalysis of the SHA-3
candidates Aurora-512 [3,21] and JH-512 [12,29]. We stress that these works
employ generic multicollisions as a part of their attacks, always assuming maximum
memory and ignoring the issue of parallel implementations.

So the question is, do authors need to be so pessimistic, or are there memory- 
efficient and parallelizable algorithms for r-collisions? For small r, and mainly 
for r = 3, the current paper provides a clearly positive answer. As an applica- 
tion of our results, we will observe attacks on the SHA-3 candidate hash function 
Aurora-512. These attacks make heavy use of multicollisions on internal struc- 
tures. Some attacks on other SHA-3 candidates don’t benefit from our algorithms 
for different reasons. See section |B1 of the appendix. 

Notation. To avoid writing cumbersome logarithmic factors, we often express
running times using the soft-Oh notation. Namely, $\tilde{O}(g(n))$ is used as a shorthand
for $O(g(n) \cdot \log(g(n))^k)$ for some fixed k.
2 Known Algorithm for 3-Collisions

While the number of values that needs to be computed before a 3-collision can
be formed is often considered and analyzed, e.g. in [13, Appendix B] or [22], the
known algorithmic method to find such a 3-collision is rarely considered in detail
and is mostly folklore. In order to compare the new algorithms described
in the following sections with existing algorithms, we thus give a precise description of
the folklore algorithm, together with a larger variety of time/memory tradeoffs.
Throughout this section, we fix two parameters $\alpha$ and $\beta$ and consider 3-collisions
for a function F defined on a set of cardinality N. The parameter $\alpha$ controls the
amount of memory, limiting it to $\tilde{O}(N^{\alpha})$. Similarly, $\beta$ controls the running time,
at $\tilde{O}(N^{\beta})$. Of course, these parameters need to satisfy the relation $\alpha \le \beta$.

We consider Algorithm 1. This algorithm is straightforward. First, it computes,
stores and sorts $N^{\alpha}$ images of random points under F. For bookkeeping
purposes, it also keeps track of the corresponding preimages. Second, it computes
$N^{\beta}$ additional images of random points and seeks each in the precomputed table.
Whenever a hit occurs, it is stored together with the initial preimage in the
sorted table. The algorithm succeeds if one of the $N^{\alpha}$ original images is found
twice more during the second phase and if the three corresponding preimages
are distinct. In the formal description given as Algorithm 1, we added an optional
step which packs colliding values generated during the first step into the
same array element. If this optional step is omitted, then the early collisions are
implicitly discarded. Indeed, in the second phase, we make sure that the search
algorithm always returns the first position where a given value occurs among
the known images F(x). During the complexity analysis, we ignore the optional
packing step since it runs in time $N^{\alpha}$ and can only improve the overall running
time by making the algorithm stop earlier.

We now perform a rough heuristic analysis of Algorithm 1, where constants
and logarithmic factors are ignored. On average, among the $N^{\beta}$ images of the
second phase, we expect that $N^{\alpha+\beta-1}$ values hit the sorted table of $N^{\alpha}$ elements.
Due to the birthday paradox, after $N^{\alpha/2}$ hits, we expect a double hit to occur. At
that point, the algorithm succeeds if the three known preimages corresponding
to the double hit are distinct, which occurs with constant probability. For the
algorithm to succeed, we need:

$$\alpha + \beta - 1 \ge \alpha/2.$$

As a consequence, to minimize the running time, we enforce the condition:

$$\alpha + 2\beta = 2. \qquad (1)$$

For $\alpha = \beta$, we find $\alpha = \beta = 2/3$ and obtain the classical folklore result with time
and memory $\tilde{O}(N^{2/3})$. Other tradeoffs are also possible. With constant memory,
i.e. $\alpha = 0$, we find a running time $\tilde{O}(N)$. Another tradeoff with $\alpha = 1/2$ and
$\beta = 3/4$ will be used as a point of comparison in section 3.
Algorithm 1. Folklore 3-collision finding algorithm
Require: Oracle access to F operating on [0, N − 1]
Require: Parameters: α ≤ β satisfying condition (1)
Let N_α ← ⌊N^α⌋
Let N_β ← ⌊N^β⌋
Create arrays Img, Pr1 and Pr2 of N_α elements.

First step:
for i from 1 to N_α do
    Let a ←_R [0, N − 1]
    Let Img[i] ← F(a)
    Let Pr1[i] ← a
    Let Pr2[i] ← ⊥
end for
Sort Img, applying the same permutation on elements of Pr1 and Pr2

Optional step (packing of existing collisions):
Let i ← 1
while i < N_α do
    Let j ← i + 1
    while Img[i] == Img[j] do
        if Pr1[i] ≠ Pr1[j] then
            if Pr2[i] == ⊥ then
                Let Pr2[i] ← Pr1[j]
            else
                if Pr2[i] ≠ Pr1[j] then
                    Output '3-Collision (Pr1[i], Pr2[i], Pr1[j]) under F' and Exit
                end if
            end if
        end if
        Let j ← j + 1
    end while
    Let i ← j
end while

Second step:
for i from 1 to N_β do
    Let a ←_R [0, N − 1]
    Let b ← F(a)
    if b is in Img (first occurrence in position j) then
        if Pr1[j] ≠ a then
            if Pr2[j] == ⊥ then
                Let Pr2[j] ← a
            else
                if Pr2[j] ≠ a then
                    Output '3-Collision (Pr1[j], Pr2[j], a) under F' and Exit
                end if
            end if
        end if
    end if
end for
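
For readers who want to experiment, the folklore search can be sketched in Python. This is a simplified illustration, not the listing above: it assumes a Python dict (a hash table) in place of the sorted arrays and binary search, it takes F as any callable on range(N), and it draws randomness from a `random.Random` instance passed in as `rng`. The function name and parameter names are hypothetical. A single run succeeds only with constant probability and returns None otherwise.

```python
import random

def folklore_3_collision(F, N, n_alpha, n_beta, rng):
    """One run of the folklore search: three distinct preimages or None."""
    table = {}  # image -> set of known preimages (stand-in for Img, Pr1, Pr2)
    # First step: store n_alpha images of random points with their preimages.
    for _ in range(n_alpha):
        a = rng.randrange(N)
        known = table.setdefault(F(a), set())
        known.add(a)
        if len(known) == 3:        # early 3-collision (optional packing step)
            return tuple(sorted(known))
    # Second step: sample n_beta further points and look up their images.
    for _ in range(n_beta):
        a = rng.randrange(N)
        known = table.get(F(a))
        if known is not None:
            known.add(a)           # no effect when a is already a known preimage
            if len(known) == 3:
                return tuple(sorted(known))
    return None

if __name__ == "__main__":
    rng = random.Random(0)
    N = 4096
    values = [rng.randrange(N) for _ in range(N)]   # a random map on [0, N - 1]
    # alpha = beta = 2/3 gives n_alpha = n_beta = 256 for N = 4096.
    print(folklore_3_collision(values.__getitem__, N, 256, 256, rng))
```

Because the lookup table is a hash table here, the logarithmic cost of sorting and binary search does not appear in this sketch.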
3 A New Algorithm for 3-Collisions

Now equipped with an analysis of Algorithm 1, we are ready to propose a new
algorithm which offers different time-memory tradeoffs, which are better balanced
for existing hardware. The basic idea is extremely simple: instead of initializing
an array with $N^{\alpha}$ images, we propose to initialize it with $N^{\alpha}$ collisions under
F. To make this efficient in terms of memory use, each collision in the array
is generated using a cycle finding algorithm on a (pseudo-)randomly permuted
copy of F. Since each collision is found in time $N^{1/2}$, the total running time of
this new first step is $N^{1/2+\alpha}$.

The second step is left unchanged: we simply create $N^{\beta}$ images of random
points until we hit one of the known collisions. Note that, thanks to the new
first phase, it now suffices to land once on a known point to succeed. As a
consequence, we can replace condition (1) by the weaker condition:

$$\alpha + \beta = 1. \qquad (2)$$

Since the running time of the first step is $N^{1/2+\alpha}$, it would not make sense
to have $\beta < 1/2 + \alpha$. Thus, we also enforce the condition $\alpha \le 1/4$. Under
this condition, the new algorithm runs in time $\tilde{O}(N^{1-\alpha})$ using $\tilde{O}(N^{\alpha})$ bits of
memory. In particular, we can find 3-collisions in time $\tilde{O}(N^{3/4})$ using $\tilde{O}(N^{1/4})$
bits of memory. This is a notable improvement over Algorithm 1, which requires
$\tilde{O}(N^{1/2})$ bits of memory to achieve the same running time.
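
The comparison between the two tradeoffs is simple exponent arithmetic, which the following sketch reproduces with exact fractions. The helper names are hypothetical; the formulas are conditions (1) and (2) solved for the memory exponent, plus the requirement that the first step of cost $N^{1/2+\alpha}$ does not exceed the second step.

```python
from fractions import Fraction

def folklore_alpha(beta):
    """Memory exponent of Algorithm 1 from condition (1): alpha + 2*beta = 2."""
    return 2 - 2 * beta

def new_alpha(beta):
    """Memory exponent of the new algorithm from condition (2): alpha + beta = 1."""
    return 1 - beta

def first_step_fits(alpha):
    """True when N^(1/2 + alpha) stays within the total time N^(1 - alpha)."""
    return Fraction(1, 2) + alpha <= 1 - alpha

beta = Fraction(3, 4)
print(folklore_alpha(beta), new_alpha(beta), first_step_fits(new_alpha(beta)))
# prints: 1/2 1/4 True
```

For running time $N^{3/4}$ the folklore algorithm needs memory exponent 1/2 while the new one needs 1/4, and 1/4 is the largest exponent for which the first step still fits.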

Note on the creation of the $N^{\alpha}$ initial collisions. One question that frequently
arises when this algorithm is presented is: "Why is it necessary to randomize F
with a pseudo-random permutation?"

Behind this question is the idea that changing the starting point of the cycle
finding algorithm should suffice to obtain random collisions. However, this is not
true. Indeed, the analysis of random mappings (for example, see [?]) shows that
on average a constant fraction of points belong to a so-called "giant tree". By
definition, each starting point in the giant tree enters the main cycle in the same
place. As a consequence, without randomization of F the corresponding collision
would be generated over and over again and the 3-collision algorithm would not
work.

4 Detailed Complexity Analysis of Algorithms 1 and 2

In this section, we analyze in more detail the complexity and success probability
of Algorithms 1 and 2, assuming that F is a random mapping. This detailed analysis
particularly focuses on the following problematic issues which were initially
neglected:

1. Among the $N_{\alpha}$ candidates stored in Img and its companion arrays, which
fraction can non-trivially be completed into a 3-collision?

2. In the second step, when a value F(a) hits the array Img, what is the probability
of obtaining a real 3-collision and not simply replaying a known value of a?




Algorithm 2. Improved 3-collision algorithm
Require: Oracle access to F operating on [0, N − 1]
Require: Family of pseudo-random permutations Π_K, indexed by K in 𝒦.
Require: Parameters: α ≤ β satisfying condition (2)
Let N_α ← ⌊N^α⌋
Let N_β ← ⌊N^β⌋
Create arrays Img, Pr1 and Pr2 of N_α elements.

First step:
for i from 1 to N_α do
    Let K ←_R 𝒦
    Use cycle finding algorithm on F∘Π_K to produce collision F∘Π_K(a) = F∘Π_K(b)
    Let Img[i] ← F∘Π_K(a)
    Let Pr1[i] ← Π_K(a)
    Let Pr2[i] ← Π_K(b)
end for
Sort Img, applying the same permutation on elements of Pr1 and Pr2

Optional step (packing of existing collisions):
Let i ← 1
while i < N_α do
    Let j ← i + 1
    while Img[i] == Img[j] do
        if Pr1[i] ≠ Pr1[j] then
            if Pr2[i] ≠ Pr1[j] then
                Output '3-Collision (Pr1[i], Pr2[i], Pr1[j]) under F' and Exit
            end if
        end if
        Let j ← j + 1
    end while
    Let i ← j
end while

Second step:
for i from 1 to N_β do
    Let a ←_R [0, N − 1]
    Let b ← F(a)
    if b is in Img (first occurrence in position j) then
        if Pr1[j] ≠ a then
            if Pr2[j] == ⊥ then
                Let Pr2[j] ← a
            else
                if Pr2[j] ≠ a then
                    Output '3-Collision (Pr1[j], Pr2[j], a) under F' and Exit
                end if
            end if
        end if
    end if
end for
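
One iteration of the first step of Algorithm 2 can be sketched as follows. The sketch makes several assumptions that are not in the paper: an affine map x -> (m·x + c) mod N with gcd(m, N) = 1 stands in for the pseudo-random permutation Π_K (it is a permutation but by no means pseudo-random), Floyd's method is used as the cycle finding algorithm, and a step budget `max_steps` triggers a restart with a fresh permutation. All function names are hypothetical.

```python
import random
from math import gcd

def random_affine_permutation(N, rng):
    """Stand-in for Pi_K (assumption): x -> (m*x + c) mod N with gcd(m, N) = 1."""
    while True:
        m = rng.randrange(1, N)
        if gcd(m, N) == 1:
            break
    c = rng.randrange(N)
    return lambda x: (m * x + c) % N

def floyd_collision(G, start, max_steps):
    """Return (a, b) with a != b and G(a) == G(b), or None on failure."""
    tortoise, hare, steps = G(start), G(G(start)), 1
    while tortoise != hare:                 # phase 1: meet inside the cycle
        if steps >= max_steps:
            return None                     # budget exceeded
        tortoise, hare, steps = G(tortoise), G(G(hare)), steps + 1
    tortoise = start
    if tortoise == hare:
        return None                         # start lies on the cycle: no collision
    while G(tortoise) != G(hare):           # phase 2: walk to the cycle entry
        tortoise, hare = G(tortoise), G(hare)
    return tortoise, hare

def collision_under_F(F, N, rng, max_steps):
    """Return (image, p1, p2) with F(p1) == F(p2) == image and p1 != p2."""
    while True:
        pi = random_affine_permutation(N, rng)          # fresh key K
        found = floyd_collision(lambda x: F(pi(x)), rng.randrange(N), max_steps)
        if found is not None:
            a, b = found
            return F(pi(a)), pi(a), pi(b)               # Img[i], Pr1[i], Pr2[i]
```

Since Π_K is a bijection, distinct points a and b that collide under F∘Π_K yield distinct preimages Π_K(a) and Π_K(b) that collide under F.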
3. Which logarithmic factors are hidden in the $\tilde{O}$ expression?

4. In the first step of Algorithm 2, how can we make sure that we never encounter
a bad configuration where the cycle finding algorithm runs for longer
than $\tilde{O}(N^{1/2})$?

To answer the first question, remark that each candidate stored into Img is a
random point that has at least one preimage for Algorithm 1 or at least two
preimages for Algorithm 2. According to [?], we know that the expected fraction
of points with exactly k distinct preimages is $e^{-1}/k!$. As a consequence, if we
denote by $P_k$ the fraction of points with at least k preimages, we find:

$$P_1 = 1 - e^{-1}, \qquad P_2 = 1 - 2e^{-1}, \qquad P_3 = 1 - \tfrac{5}{2}e^{-1}.$$

The expected fraction of elements from Img which can be correctly completed
into a 3-collision is $P_3/P_1 \approx 0.127$ for Algorithm 1 and $P_3/P_2 \approx 0.304$ for
Algorithm 2. To compensate for the loss, the easiest is to make the stored set larger
by a factor of 8 in the first case and 3 in the second.
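
The two ratios can be recomputed from the stated distribution, in which a point has exactly k preimages with probability $e^{-1}/k!$. The following short check is only a numerical illustration; the function name is hypothetical.

```python
from math import exp, factorial

def fraction_at_least(k):
    """P_k = 1 - sum_{j < k} e^{-1} / j!, the fraction of points with >= k preimages."""
    return 1 - sum(exp(-1) / factorial(j) for j in range(k))

p1, p2, p3 = (fraction_at_least(k) for k in (1, 2, 3))
print(round(p3 / p1, 3), round(p3 / p2, 3))  # prints: 0.127 0.304
```

The reciprocals, about 7.9 and 3.3, are the origin of the enlargement factors 8 and 3.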

We now turn to the second question. Of course, at this point, the candidates
that cannot be correctly completed need to be ignored. Among the original set
of $N_{\alpha}$ candidates, we now focus on the subset of candidates that can correctly be
completed and let $N'_{\alpha}$ denote the size of this subset. Since in the second phase we
are sampling points uniformly at random, the a posteriori probability of having
chosen one of the two already known preimages is at most 2/k, where k is the
number of distinct preimages for this point. Since $k \ge 3$, the a posteriori probability
of choosing a new preimage is at least 1/3. Similarly, for Algorithm 1,
the a posteriori probability of choosing a preimage distinct from the single
originally known one is at least 2/3. To offset this loss of probability, $N_{\beta}$ should be
multiplied by a constant factor of 3.

The logarithmic factors involved in the third question are easy to find: they
simply come from the sort and binary search steps. Note that when
$N_{\alpha} \cdot \log(N_{\alpha}) < N_{\beta}$, the sort operation costs less than the second step and can be ignored.
Moreover, as soon as $\alpha < \beta$, this bound is asymptotically achieved when N
tends to infinity. However, the binary search appears within the second step and
a real penalty is paid.

If we are willing to spend some extra memory, blowing up the memory by
a constant factor, this cost can be eliminated using hashing techniques. To
cover the case of $N_{\alpha} \cdot \log(N_{\alpha}) = N_{\beta}$, we need a data structure with constant-time
insert and lookup operations. One such data structure is "cuckoo hashing",
where lookup operations need worst-case constant time, and insert operations
need expected constant time, as long as less than half of the memory slots are
used [?].² However, for typical applications, the cost of the binary search ought
to remain small, compared to the cost of evaluating the function F. Thus, in
practice, we expect only a tiny benefit from using hash tables.

2 Furthermore, delete operations only need worst-case constant time, and recent
improvements even enable update operations in worst-case constant time [?].
The simplest answer to the fourth question is to fix some upper bound on
the allowed running time of each individual call to the cycle finding
algorithm that produces a collision. If the running time is exceeded, we abort and restart with a
fresh permutation $\Pi_K$. With a time limit of the form $\lambda\sqrt{N}$ and a large enough
value of $\lambda$, we make sure that each individual call to the cycle finding algorithm
runs in time $O(N^{1/2})$ and the probability of success is a constant close to 1, say
larger than 2/3.

5 A Second Algorithm with More Tradeoff Options 

The algorithm presented in section 3 only works for memory up to $N^{1/4}$. This
limitation is due to the way the collisions are generated during the first step
of Algorithm 2. In order to extend the range of possible tradeoffs beyond that
point, it suffices to find a replacement for this first step. Indeed, the second
step clearly works with a larger value of $\alpha$, as long as we keep the relation
$\alpha + \beta = 1$. Of course, since no 3-collision is expected before we have performed
$N^{2/3}$ evaluations of F, the best we can hope for is an algorithm with running time
$N^{2/3}$. Such an algorithm may succeed if we can precompute a table containing
$N^{1/3}$ ordinary collisions.

In this section, we consider the problem of generating $N^{1/3}$ collisions in time
bounded by $\tilde{O}(N^{2/3})$ using at most $\tilde{O}(N^{1/3})$ bits of memory. Surprisingly, a
simple method inspired by Hellman's time-memory tradeoff [?] is able to solve
this problem. More generally, for $\alpha \le 1/3$, this method allows us to compute $N^{\alpha}$
collisions in time less than $\tilde{O}(N^{1-\alpha})$ using at most $\tilde{O}(N^{\alpha})$ bits of memory. The
idea is to first build $N^{\alpha}$ chains of length $N^{\gamma}$; each chain starts from a random
point and is computed by repeatedly applying F up to the $N^{\gamma}$-th iteration.
The end-point of each chain is stored together with its corresponding start-point.
Once the chains have been built, we sort them by end-point values. Then,
restarting from $N^{\alpha}$ new random points, we once again compute chains of length
$N^{\gamma}$; the difference is that we now test after each evaluation of F whether the
current value is one of the known end-points. In that case, we know that the chain
we are currently computing has merged with one chain from the precomputation
step. Such a merge usually corresponds to a collision; the only exception occurs
when the start-point of the current chain already belongs to a precomputed
chain (a "Robin Hood" using the terminology of [23]). Then, backtracking to the
beginning of both chains, we can easily construct the corresponding collision. A
pseudo-code description of this alternative first step is given as Algorithm 3.

Note that, instead of building two sets of chains, it is also possible to build a 
single set and look for previously known end-points. This alternative approach is 
a bit trickier to implement but uses fewer evaluations of F. However, the overall 
cost of the algorithm remains within the same order. 

Clearly, since each of the two sets of chains we are constructing contains $N^{\alpha+\gamma}$
points, the expected number of collisions is $O(N^{2\alpha+2\gamma-1})$. Remembering that
we wish to construct $N^{\alpha}$ collisions, we need to let $\gamma = (1-\alpha)/2$. The running
time necessary to compute these collisions is $N^{\alpha+\gamma} = N^{(1+\alpha)/2}$. Note that, since
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Algorithm 3. Alternative method for constructing N a collisions 
Require: Oracle access to F operating on [0, N — 1] 

Require: Parameter: a < 1/3 
Let 7 * — (1 — a)/2 
Let N a * — \N a J 
Let + — \N"> J 

Create arrays Start and End of N a elements. 

Create arrays Img, Pn and Pr2 of N„ elements. 

Construction of first set: 
for i from 1 to N a do 
Let a * — r [0, N — 1] 

Let Start [i] 4 — a 
for i from 1 to Ay do 
Let a < — F(a) 
end for 

Let End[i] < — a 

end for 

Sort End, applying the same permutation on elements of Start 
Construction of second set and collisions: 

Let t < — 1 
while t < N a do 

Let a < — r [0, N - 1] 

Let b < — a 
for j from 1 to A% do 
Let b < — F{b) 

if 6 is in End (first occurrence in position k ) then 
Let a' < — Start [fc] 
for l from 1 to A r 7 — j do 
Let a' < — F(a!) 

end for 

if a ^ o' then 

(Checks that a genuine merge between chains exists} 

Let 6 < — F{. a ) 

Let b' < — F(a') 
while b ^ b’ do 
Let a * — b 
Let a 1 < — b' 

Let b < — F(a) 

Let b' * — F(a') 
end while 
Let Img[i] < — b 
Let Pri[t] * — a 
Let Pr2[f] * — a' 

Let t * — t -f 1 
end if 

Exit Loop on j 

end if 
end for 
end while 

Return arrays Img, Pri and Pr2 containing N a collisions. 
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a < 1/3, we have (1 + a)/2 < 1 — a. As a consequence, the running time of the 
complete algorithm is dominated by the running time N® = N l ~ a of the second 
step. 
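To make the chain-merging idea concrete, here is a minimal Python sketch of the two-set method. It is an illustration under stated assumptions, not the authors' implementation: the oracle is replaced by a small table-based random map (`make_random_map`, a hypothetical helper), end-points are kept in a dictionary instead of a sorted array, and an attempt cap is added so the loop always terminates.

```python
import random

def make_random_map(n, seed=1):
    """Hypothetical stand-in for the oracle F: a fixed random map on [0, n-1]."""
    rng = random.Random(seed)
    table = [rng.randrange(n) for _ in range(n)]
    return lambda x: table[x]

def chain_collisions(F, n, n_chains, chain_len, seed=2):
    """Return triples (img, pre1, pre2) with pre1 != pre2 and F(pre1) == F(pre2) == img."""
    rng = random.Random(seed)
    # First set: remember only end-point -> start-point (first occurrence wins).
    end_to_start = {}
    for _ in range(n_chains):
        s = a = rng.randrange(n)
        for _ in range(chain_len):
            a = F(a)
        end_to_start.setdefault(a, s)
    found = []
    attempts = 0
    # Second set: walk new chains and test every value against the end-points.
    while len(found) < n_chains and attempts < 50 * n_chains:
        attempts += 1
        a = b = rng.randrange(n)
        for j in range(1, chain_len + 1):
            b = F(b)
            if b in end_to_start:
                a2 = end_to_start[b]
                for _ in range(chain_len - j):
                    a2 = F(a2)
                # a and a2 are now both exactly j steps before the shared end-point.
                if a != a2:  # a == a2 is the "Robin Hood" case: no collision
                    while F(a) != F(a2):
                        a, a2 = F(a), F(a2)
                    found.append((F(a), a, a2))
                break
    return found
```

With $N = 2^{12}$ and $\alpha = \gamma = 1/3$ one would call `chain_collisions(F, 4096, 16, 16)`; the same collision may be reported more than once, as in the pseudo-code.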


6 Parallelizable 3-Collision Search 

Since the computation involved during a search for 3-collisions is massive, it
is essential to study the possibility of parallelizing such a search. For
ordinary collisions, parallelization is studied in detail in [?] using ideas
introduced in [?].


We first remark that the algorithms we have studied up to this point are badly
suited to parallelization. Their main problem is that a large amount of memory
needs to be replicated on every processor, which is very impractical,
especially when we want to use a large number of low-end processors. We now
propose an algorithm specifically suited to parallelization. For simplicity of
exposition, we first assume that $N_p \approx N^{1/3}$ processors are available
and aim at a running time of $O(N^{1/3})$. Moreover, we would like each
processor to use only a constant amount of memory. However, we assume that
every processor can efficiently communicate with every other processor, as long
as the amount of transmitted data remains small. It would be easy to adapt the
approach to a network of small processors, with each processor connected to a
central computer possessing $O(N^{1/3})$ bits of memory.

As for ordinary collisions, the key idea is to use distinguished points. By
definition, a set of distinguished points is a set of points together with an
efficient procedure for deciding membership. For example, the set of elements
in $[0, M-1]$ can be used as a set of distinguished points since membership can
be tested using a single comparison. Moreover, with this choice, the fraction
of distinguished points among the whole set is simply $M/N$. Here, since we
wish to have chains of average length $N^{1/3}$, we choose for $M$ an integer
near $N^{2/3}$.
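The following Python sketch illustrates the distinguished-point test and the resulting chain length. The function names and the `max_steps` abort parameter are illustrative assumptions; the only facts taken from the text are that a point $x$ is distinguished when $x < M$ and that the expected chain length is $N/M$.

```python
def expected_chain_length(N, M):
    """Each value is distinguished with probability M/N, so a chain needs N/M steps on average."""
    return N / M

def walk_to_distinguished(F, start, M, max_steps):
    """Iterate F from start until a value below M is reached.

    Returns (distinguished_point, length), or None if the walk is aborted
    after max_steps evaluations of F.
    """
    a = start
    for length in range(1, max_steps + 1):
        a = F(a)
        if a < M:  # membership test: a single comparison
            return a, length
    return None
```

For instance, with $N = 2^{64}$ and $M = 2^{44}$ the expected chain length is $2^{20}$.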

The distinguished point algorithm works in two steps. During the first step,
each processor starts from a random start-point $s$ and iteratively applies $F$
until a distinguished point $d$ is encountered. It then transmits a triple
$(s, d, L)$, where $L$ is the length of the path from $s$ to $d$, to the
processor whose number is $d \pmod{N_p}$. We abort any processor if it doesn't
find a distinguished point within a reasonable amount of time; for example,
following what [?] does for 2-collisions, we may abort after $20\,N/M$ steps.
Once all the paths have been computed, we start the second step. Each processor
looks at the triples it now holds. If a given value of $d$ appears three or
more times, the processor recomputes the corresponding chains, using the known
length information to synchronize the chains. If three of the chains merge at a
common position, a 3-collision is obtained.

Of course, even with fewer than $N^{1/3}$ processors, it is possible to do a
partial parallelization. More precisely, given $N^{\theta}$ processors with
$\theta < 1/3$, it is possible to find 3-collisions in time
$O(N^{2/3-\theta})$. In that case, each processor needs a local memory of size
$O(N^{1/3-\theta})$ to store all the triples it owns.
Algorithm 4. Parallelizable 3-collisions using distinguished points
Require: Oracle access to $F$ operating on $[0, N-1]$
Require: Number of processors $N_p \le N^{1/3}$
Require: Identity of current processor: $Id \in [0, N_p-1]$
Let $M \leftarrow \lceil N^{2/3} \rfloor$ {$M$ defines distinguished points}
Let $L_{max} \leftarrow 20 \lceil N^{1/3} \rfloor$
Construction of triples:
Let $s \leftarrow_R [0, N-1]$; $a \leftarrow s$; $L \leftarrow 0$
while $L < L_{max}$ do
  Let $a \leftarrow F(a)$; $L \leftarrow L + 1$
  if $a < M$ then
    Send triple $T \leftarrow (s, a, L)$ to processor $a \pmod{N_p}$ and Exit Loop
  end if
end while
Acquisition of triples:
Store received triples $(s, d, L)$ in local arrays $A$, $D$, $C$ numbered from 1 to $K$
Sort $D$, applying the same permutation on elements of $A$ and $C$
Processing of triples:
Let $i \leftarrow 1$
while $i < K$ do
  Let $j \leftarrow i + 1$
  while $j \le K$ and $D[j] = D[i]$ do
    Let $j \leftarrow j + 1$
  end while
  if $j \ge i + 3$ then
    Let $L \leftarrow \max(C[i], \ldots, C[j-1])$
    for $l$ from $L$ downto 0 do
      for $k$ from $i$ to $j - 1$ do
        if $C[k] > l$ then
          Let $D[k] \leftarrow A[k]$; $A[k] \leftarrow F(A[k])$
          {$D[k]$ overwritten to keep previous value of $A[k]$}
        end if
      end for
      Check for 3 equal values in $A[i \cdots j-1]$ with differing values of $D$
      If found, Output the 3-collision and Exit
    end for
  end if
  Let $i \leftarrow j$
end while
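The two steps of the distinguished-point method can be sketched in Python as follows. This is a single-machine simulation under stated assumptions: the per-processor exchange of triples is replaced by a dictionary keyed by the distinguished end-point, the function name `find_3_collision` and its parameters are hypothetical, and chains are synchronised by counting the remaining distance to the end-point rather than by overwriting array $D$.

```python
import random
from collections import defaultdict

def find_3_collision(F, n, M, n_chains, max_len, seed=0):
    """Return (img, [p1, p2, p3]) with three distinct preimages of img under F, or None."""
    rng = random.Random(seed)
    # Step 1: build chains and group the pairs (start, length) by end-point d.
    groups = defaultdict(list)
    for _ in range(n_chains):
        s = a = rng.randrange(n)
        for length in range(1, max_len + 1):
            a = F(a)
            if a < M:  # distinguished point reached
                groups[a].append((s, length))
                break  # chains that never reach one are simply dropped
    # Step 2: recompute the chains of every end-point held by 3 or more triples.
    for chains in groups.values():
        if len(chains) < 3:
            continue
        longest = max(length for _, length in chains)
        cur = [s for s, _ in chains]
        # A chain starts moving once its length equals the remaining distance,
        # so all moving chains are at the same distance from d.
        for remaining in range(longest, 0, -1):
            preimages = defaultdict(set)
            for k, (_, length) in enumerate(chains):
                if length >= remaining:
                    prev = cur[k]
                    cur[k] = F(prev)
                    preimages[cur[k]].add(prev)
            for img, pre in preimages.items():
                if len(pre) >= 3:
                    return img, sorted(pre)[:3]
    return None
```

On a toy random map with $N = 2^{12}$, $M = 2^{8}$ and an abort length of $20 \cdot 2^{4}$, a few thousand chains are enough in practice to hit a 3-collision; these values are far above the asymptotic requirement and are chosen only to make the demonstration reliable.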


7 Extension to r-Collisions, for r > 3 

For $r$-collisions, recall that we need to evaluate $F$ on approximately
$(r!)^{1/r} N^{(r-1)/r}$ points before hoping for a collision. When considering
that $r$ is a fixed value, $(r!)^{1/r}$ is a constant and vanishes within the
$O$ notation. With this new context, Algorithm 4 is quite easy to generalize.
Here, the important point is to create shorter chains and compute more of them.
The reason for shorter chains is that (as in Hellman's algorithm [?]), we need
to make sure that there are not too many collisions between one chain and all
the others. Otherwise, the algorithm spends too much time recomputing the same
evaluations of the random map, which is clearly a bad idea. To avoid this, we
construct chains which are short enough to make sure that the average number of
(initial$^3$) collisions between an individual chain and all the other chains
is a constant. Since the total number of elements in all the other chains is
essentially $N^{(r-1)/r}$, the length of chains should remain below $N^{1/r}$.

To achieve maximal parallelization when searching for an $r$-collision,
$N_p \approx N^{(r-2)/r}$ processors are required. The integer $M$ that defines
distinguished points should be near $N^{(r-1)/r}$. Each processor first builds
a chain of average length $N^{1/r}$ (as before we abort after $20\,N/M$ steps),
described by a triple $(s, d, L)$. Each chain is sent to the processor whose
number is $d \pmod{N_p}$. During the second step, any processor that holds a
value of $d$ that appears in $r$ or more triples recomputes the corresponding
chains. If $r$ chains merge at the same position, an $r$-collision is obtained.

Given $N^{\theta}$ processors with $\theta < (r-2)/r$, it is possible to find
$r$-collisions in time $O(N^{(r-1)/r-\theta})$. In that case, each processor
needs a local memory of size $O(N^{(r-2)/r-\theta})$.

With a single processor, the required amount of memory is $O(N^{(r-2)/r})$.
Thus, as $r$ grows, the advantage of the single processor approach over the
folklore algorithm (which requires $O(N^{(r-1)/r})$ memory) becomes smaller and
smaller. As a consequence, for larger values of $r$, it is essential to rely on
parallelization.
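The exponents quoted in this section can be tabulated with a small helper. The sketch below is only bookkeeping for the formulas stated above; the function name and the dictionary keys are illustrative, and all constants hidden in the $O$ notation are ignored.

```python
from fractions import Fraction

def r_collision_exponents(r, theta=Fraction(0)):
    """Exponents e such that each quantity is about N**e, for N**theta processors."""
    theta = Fraction(theta)
    if r < 3 or not (0 <= theta <= Fraction(r - 2, r)):
        raise ValueError("need r >= 3 and 0 <= theta <= (r-2)/r")
    return {
        "time": Fraction(r - 1, r) - theta,            # running time
        "memory": Fraction(r - 2, r) - theta,          # local memory per processor
        "chain_length": Fraction(1, r),                # average chain length
        "distinguished_bound": Fraction(r - 1, r),     # M is chosen near N**((r-1)/r)
    }
```

For example, `r_collision_exponents(3, Fraction(1, 3))` gives time $N^{1/3}$ with constant local memory, and `r_collision_exponents(4)` gives time $N^{3/4}$ and memory $N^{1/2}$ on a single processor.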

8 Conclusion 

In this paper, we revisited the problem of constructing multicollisions on
random mappings and showed that it can be done using less memory than required
by the folklore algorithm. For 3-collisions, the sequential running time
remains at $O(N^{2/3})$ but the amount of memory can be reduced from
$O(N^{2/3})$ to $O(N^{1/3})$. A remaining open problem is to determine whether
this amount of memory can be reduced further.

Furthermore, finding 3-collisions can be very efficiently parallelized. Given
$N^{1/3}$ parallel processors, each equipped with constant memory, the problem
can be solved in time $O(N^{1/3})$. More generally for $r > 3$, we show how to
generate $r$-collisions on $N^{\theta}$ processors, each with local memory
$O(N^{(r-2)/r-\theta})$, in time $O(N^{(r-1)/r-\theta})$. It is interesting to
note that the cost of the parallelizable approach in the full-cost model [?]
decreases as $\theta$ grows.


3 Of course, once a collision occurs, all the values that follow are colliding.
However, we do not count these follow-up collisions.
A Practical Implementation

Since we only performed a heuristic analysis of our algorithms, in order to
show that they are really effective, we decided to illustrate our 3-collision
techniques with a practical example. For this purpose, we construct a random
function by XORing two copies of the DES algorithm (with two different keys).
More precisely, we let:

$$F(x) = \mathrm{DES}_{K_1}(x) \oplus \mathrm{DES}_{K_2}(x),$$

where$^4$ $K_1 = (3322110077665544)_{16}$ and $K_2 = (3b2a19087f6e5d4c)_{16}$.
Since $x$ is on 64 bits, the time and memory requirements of the folklore
algorithm are around $2^{43}$. Where current computers are concerned,
performing $2^{43}$ operations is easily feasible. However, storing $2^{43}$
values of $x$ requires $2^{46}$ bytes, i.e. 64 Terabytes. As a consequence,
finding 3-collisions on $F$ with the basic parameters of the folklore algorithm
is probably beyond feasibility. Using a different time-memory trade-off,
restricting the storage to $2^{32}$ values would raise the time requirement to
$2^{48}$ operations. This is within the range of currently accessible
computations. However, since the algorithm is not parallelizable, it would
require a high-end computer.
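The feasibility figures above follow from simple arithmetic, which the Python sketch below reproduces. The helper names are illustrative; the only assumptions are that each stored value of $x$ occupies 8 bytes (64 bits) and that the folklore algorithm needs about $(3!)^{1/3} N^{2/3}$ evaluations.

```python
import math

def folklore_evaluations_log2(n_bits, r=3):
    """log2 of (r!)**(1/r) * N**((r-1)/r) for N = 2**n_bits."""
    return math.log2(math.factorial(r)) / r + n_bits * (r - 1) / r

def storage_bytes(log2_values, bytes_per_value=8):
    """Bytes needed to store 2**log2_values values of the given size."""
    return (2 ** log2_values) * bytes_per_value

# 2^43 stored 64-bit values occupy 2^46 bytes, i.e. 64 * 2^40 bytes.
tebibytes = storage_bytes(43) // 2 ** 40
```

For a 64-bit domain, `folklore_evaluations_log2(64)` lies between 43 and 44, which matches the "around $2^{43}$" estimate.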

4 These keys might seem weird, but they should not have any special
properties. In truth, we intended to choose $K_1 = (0011223344556677)_{16}$ and
$K_2 = (08192a3b4c5d6e7f)_{16}$, i.e., $(8899aabbccddeeff)_{16}$ with high bits
stripped. Unfortunately, the first-named author made a classical endianness
mistake while implementing the algorithm.
With the new algorithms presented in this paper, it becomes possible to compute
triple collisions much more efficiently on the function $F$. For our
implementation, we chose $M = 2^{44}$ to define the distinguished points, which
yielded chains of expected length $2^{20}$. The abort length was set at 8 times
the expected length, rather than the factor 20 given in Algorithm 4. For
computing the chains, we used a mix of 32 Intel Xeon processors at 2.8 GHz and
8 Nvidia CUDA cards (Tesla type). We collected a total of 35 447 322 chains and
obtained 3 078 699 groups of three or more chains yielding the same
distinguished endpoints. The largest group contained 36 chains, which shows
that it would have been preferable to use slightly shorter chains. On
processors only, this first phase would have taken about 94 CPU-days to run. On
a single CUDA card, it would have taken 11.5 days.

For simplicity of implementation, the second phase of the algorithm was only 
performed on Intel processors and not on CUDA cards. It took less than 18 
CPU-days to test all groups and it yielded the following triple-collisions: 

F(d332b9ba5e5a7dAe) = F(5168095d6532a/cc) = F(&084dcl5dce042a&), 
F(ca76//906d6587c/) = F(el/7/59a5757d016) = F(0285/58147e863c2), 
F(c3783e/30c86cc3d) = F(65/14d412/d91173) = F(1042d827e5078000). 

We would like to thank CEA/DAM$^5$ for kindly providing the necessary computing
time on its Tesla servers.

B Applications 

B.1 Collisions for the Hash Function AURORA-512

AURORA is a family of cryptographic hash functions submitted to the NIST SHA-3
hash function competition [?]. Like the other members of the AURORA family,
AURORA-512 employs different internal compression functions, each mapping a
256-bit chaining value and a 512-bit message block to a new 256-bit chaining
value. AURORA-512 is the high-end member of that family, maintaining an
internal state of 512 bits. As required by the NIST, the authors of AURORA-512
explicitly claim “collision resistance of approximately 512/2 bits” for
AURORA-512. In other words, collision attacks must not significantly improve
over the generic birthday attack, which takes roughly the time of $2^{256}$
hash operations.

Internally, AURORA-512 works almost like the cascade of two iterated hash
functions, except for one important extra operation:

$$MF : \{0,1\}^{256} \times \{0,1\}^{256} \to \{0,1\}^{256} \times \{0,1\}^{256}.$$

See Algorithm 5 for a simplified description of AURORA-512.

Every eighth iteration, MF is called to mix the two half-states. This seems to
defend against the cascade-attack from [?]: Between two MF-operations, one can
generate local collisions in each iteration in one of either the left string,
or the right string. Thus, the adversary can get a local $2^8$-collision. But
to apply the attack from [?], one would rather need a $2^{128}$-collision, so
the attack fails.

Algorithm 5. AURORA-512: Hashing 8 message blocks.
Require: Input Chaining Values (Left, Right) $\in (\{0,1\}^{256})^2$
for $i$ from 0 to 7 do
  Left $\leftarrow$ Compress(Left, Message_Block($i$))
  Right $\leftarrow$ Compress(Right, Message_Block($i$))
end for
(Left, Right) $\leftarrow$ MF(Left, Right)

5 Commissariat à l'énergie atomique, Direction des applications militaires.

Assume, for a moment, that the adversary has generated a $2^7$-collision on
Left in the first 7 iterations of the loop. For the right string, we have $2^7$
different values $\mathrm{Right}_1, \mathrm{Right}_2, \ldots,
\mathrm{Right}_{128}$. If two of them collide, a collision for AURORA-512 has
been found. For a fixed Message_Block(7), the chance of a collision, i.e. of
$j \ne k$ with

$$\mathrm{Compress}(\mathrm{Right}_j, \mathrm{Message\_Block}(7)) = \mathrm{Compress}(\mathrm{Right}_k, \mathrm{Message\_Block}(7))$$

is about $2^7 \cdot (2^7 - 1) \cdot 2^{-1} / 2^{256}$. By trying out
$2^{256-(6+7)}$ different values for Message_Block(7), we expect to find a
collision. Note that each such trial means making $2^7$ calls to the function
Compress. Hence, this attack takes the time of about
$2^{256-(6+7)+7} = 2^{250}$ compression function calls, plus the time to
generate the $2^7$-collision at the beginning. This is essentially the
memoryless variant of the attack from [?], except that the authors of [?]
actually generate a $2^8$-collision on Left, by exploiting the previous
eight-tuple of message blocks. The attack is memoryless, since the adversary
only needs to generate 2-collisions on Left, and the claimed time is $2^{249}$.
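The cost estimate of this paragraph can be checked numerically. The Python sketch below is only a restatement of the birthday arithmetic above; the function name and its parameters are illustrative, and the compression function is treated as a random 256-bit map.

```python
import math

def aurora_last_block_cost_log2(k=7, state_bits=256):
    """Return (log2 of message-block trials, log2 of Compress calls).

    Among 2**k right-hand values there are 2**k * (2**k - 1) / 2 pairs, each
    colliding with probability 2**-state_bits for a fixed message block.
    """
    pairs_log2 = math.log2(2 ** k * (2 ** k - 1) / 2)
    trials_log2 = state_bits - pairs_log2   # expected number of message blocks
    total_log2 = trials_log2 + k            # 2**k Compress calls per trial
    return trials_log2, total_log2
```

With the default values this yields roughly $2^{243}$ trials and $2^{250}$ compression function calls, in line with the text.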

In [?], Ferguson and Lucks further propose an attack which uses local
$r$-collisions, instead of local 2-collisions. A similar attack has been
proposed independently in [?]. Using eight local $r$-collisions allows one to
speed up the attack to roughly $2^{256}/r^7$ compression function calls (plus
the time to generate the required $r$-collisions). The authors of [?] suggest
$r = 9$ (beyond that, computing the $r$-collisions becomes too costly) and
claim time $2^{234.5}$, including the time to generate ten local 9-collisions.
The price for the speed-up is utilizing a huge amount of memory, however.

Our memory-efficient 3-collision algorithm allows a different time-memory
tradeoff. The time is $2^{256}/3^7 \approx 2^{245}$. Recall $N = 2^{256}$, and
set $\alpha := 1/16$, $\beta := 15/16$ in Algorithm 2. In that case one local
3-collision requires time $2^{240}$, which we neglect. The memory requirements
are down to $2^{16}$, i.e., almost negligible.

It is also possible to use more general $r$-collisions to further improve this
attack. For example, we can use 4-collisions obtained using the algorithm of
Section 7. To simplify the comparison with previous attacks, we assume a single
processor, i.e. set $\theta = 0$; however, with more processors, we would
obtain an even better attack. With this choice, a 4-collision on 256 bits is
obtained in time $2^{192}$ using a memory of size $2^{128}$. The corresponding
speedup is $4^7$. Similarly, 8-collisions on 256 bits are obtained in time
$2^{224}$ each, using $2^{192}$ units of memory. The speed-up is $8^7$. Other
trade-offs are possible.
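The trade-offs discussed above can be recomputed with the following Python sketch. It only evaluates the formulas stated in the text ($2^{256}/r^7$ for the attack, $N^{(r-1)/r-\theta}$ and $N^{(r-2)/r-\theta}$ for one $r$-collision, and $N^{1-\alpha}$, $N^{\alpha}$ for the memory-efficient 3-collision); the function names are illustrative.

```python
import math

def aurora_tradeoff(r, state_bits=256, theta=0.0):
    """log2 of (attack time, time per r-collision, memory per r-collision)."""
    attack_log2 = state_bits - 7 * math.log2(r)
    rcoll_time_log2 = state_bits * ((r - 1) / r - theta)
    rcoll_mem_log2 = state_bits * ((r - 2) / r - theta)
    return attack_log2, rcoll_time_log2, rcoll_mem_log2

def three_collision_alpha(alpha, state_bits=256):
    """log2 of (time, memory) for one 3-collision with parameter alpha."""
    return state_bits * (1 - alpha), state_bits * alpha
```

For instance, `aurora_tradeoff(4)` gives an attack time near $2^{242}$ with 4-collisions costing $2^{192}$ time and $2^{128}$ memory, while `three_collision_alpha(1/16)` gives $2^{240}$ time and $2^{16}$ memory.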

The results on collision attacks for AURORA-512 can be summarised as 
follows: 
B.2 Attacks on Other Hash Functions

Several attacks on other SHA-3 candidates make heavy use of multicollisions,
and it appears a natural idea to plug in our algorithms for reducing the memory
consumption of these attacks. We actually tried to do so, but only succeeded
for AURORA-512. In the current section, we will explain why we failed for other
obvious candidates.

Several attacks, such as the attacks on Blender [?] and on Twister [?], employ
multicollisions, but it turns out that these can actually be generated by
Joux-style iterated 2-collisions, which is very memory-efficient - and also
faster than our general multicollision algorithms, anyway.

An obvious candidate to employ our algorithms to improve given cryptanalytic
attacks is a preimage attack on JH-512 [?]. Like AURORA, JH is a family of hash
functions submitted to the SHA-3 competition. The high-end 512-bit variant is
denoted as JH-512. Internally, JH-512 is a wide-pipe hash function with an
internal state of 1024 bits, and it employs an invertible compression function.
The authors of [?] propose a meet-in-the-middle attack which requires
“$2^{510.3}$ compression function evaluations and a similar amount of memory”
(our emphasis). The authors of [?] stress: “We do not claim that our attack
breaks JH-512 (due to the high memory requirements).” The author of JH-512
provides a more detailed analysis of this attack, claiming “$2^{510.6}$ [units
of] memory”. A main phase of the attack is generating several 51-collisions on
one half of the chaining values (i.e., on 512 bits). By applying our algorithms
to this task, it is possible to reduce the memory required for this phase to
$2^{(512/51) \cdot 49}$ units of memory.

But another phase of the attack from [?] is to apply the inverse of the
compression function to generate $2^{509}$ internal target values. The attack
successfully generates a message which hashes to a given preimage, if the first
part of the message hashes to any of these $2^{509}$ target values. Finally,
the overall amount of storage for the attack is dominated by storing these
$2^{509}$ values, regardless of improving memory-efficiency of the
multicollision phase.
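To make the comparison explicit, the memory bound for the multicollision phase can be evaluated numerically. The following worked computation only instantiates the single-processor memory exponent $(r-2)/r$ with $r = 51$ and $N = 2^{512}$, as stated above:

```latex
\[
  N^{(r-2)/r} \;=\; 2^{512 \cdot 49/51} \;=\; 2^{25088/51} \;\approx\; 2^{491.9},
  \qquad\text{whereas the target values alone require } 2^{509} \text{ units of storage.}
\]
```

The saving of roughly $2^{17}$ in the multicollision phase is therefore hidden by the storage of the target values.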
Abstract. The design of cryptographic hash functions is a very complex 
and failure-prone process. For this reason, this paper puts forward a 
completely modular and fault-tolerant approach to the construction of a 
full-fledged hash function from an underlying simpler hash function H 
and a further primitive F (such as a block cipher), with the property 
that collision resistance of the construction only relies on H, whereas 
indifferentiability from a random oracle follows from F being ideal. In 
particular, the failure of one of the two components must not affect the 
security property implied by the other component. 

The Mix-Compress-Mix (MCM) approach by Ristenpart and Shrimp- 
ton (ASIACRYPT 2007) envelops the hash function H between two in- 
jective mixing steps, and can be interpreted as a first attempt at such a 
design. However, the proposed instantiation of the mixing steps, based 
on block ciphers, makes the resulting hash function impractical: First, it 
cannot be evaluated online, and second, it produces larger hash values 
than H, while only inheriting the collision-resistance guarantees for the 
shorter output. Additionally, it relies on a trapdoor one-way permutation, 
which seriously compromises the use of the resulting hash function for 
random oracle instantiation in certain scenarios. 

This paper presents the first efficient modular hash function with
online evaluation and short output length. The core of our approach
consists of novel block-cipher based designs for the mixing steps of the
MCM approach which rely on significantly weaker assumptions: The first
mixing step is realized without any computational assumptions (besides
the underlying cipher being ideal), whereas the second mixing step only
requires a one-way permutation without a trapdoor, which we prove to be
the minimal assumption for the construction of injective random oracles.

1 Introduction 

Multi-Property Hash Functions. Cryptographic hash functions play a central role
in efficient schemes for several cryptographic tasks, such as message
authentication, public-key encryption, digital signatures, key derivation, and
many others. Yet the huge variety of contexts in which hash functions are
deployed makes the security requirements on them very diverse: While some
schemes only assume relatively simple properties such as one-wayness or
different forms of collision resistance, other schemes, including practical
ones such as OAEP [?] and PSS [?], are only proven secure under the assumption
that the underlying hash function is a random oracle [?], i.e., a truly random
function which can be evaluated by the adversary. On the one hand, while a
number of provably-secure collision-resistant hash functions, such as VSH [?]
or SWIFFT [?], have been designed, they are not appropriate candidates for
random oracle instantiation. On the other hand, well-known theoretical
limitations [?] only permit constructions of hash functions for random oracle
instantiation from idealized primitives [?], such as a fixed-input-length
random oracle or an ideal cipher,$^1$ but (as first pointed out in [2]) these
constructions may lose any security guarantees as soon as the adversary gets to
exploit non-ideal properties of the underlying primitive.$^2$

While one could in principle always employ a suitable hash function tailored to
the individual security property needed by one particular cryptographic scheme
at hand, common practices such as code re-use and the development of standards
call for the design of a single hash function satisfying as many properties as
possible. This point of view has also been adopted by NIST's on-going SHA-3
competition [?], and motivated a series of works [?] shifting the design
problem of multi-property hash functions to the task of constructing good
multi-property compression functions. A further line of research has been
devoted to robust multi-property combiners [?], which merge two hash functions
such that the resulting function satisfies each of the properties possessed by
at least one of the two starting functions. While these works simplify the
design task, building multi-property hash functions from single-property
primitives remains far from being simple, and is the main topic of this paper.

Statement of the Main Problem. This paper presents a modular design for hash
functions that are collision resistant in the standard model and can,
simultaneously, be used for random oracle instantiation in the ideal model. We
consider a setting where both a hash function $H$ as well as some other
(potentially ideal) primitive $F$ (such as a block cipher) are given (a similar
setup was previously considered by Ristenpart and Shrimpton [23]): We aim at
devising a construction $C^{H,F}$ which is collision resistant as long as $H$
is collision resistant,$^3$ and which behaves as a random oracle (with respect
to the notion of indifferentiability [?]) whenever $F$ is ideal. For this
approach to be practically appealing, the construction must preserve the good
properties of $H$: For instance, it must allow for online processing of data
(which is crucial for large inputs or in streaming applications) whenever $H$
can be evaluated online.$^4$ Also, the construction should not increase the
size of the hashes of $H$.

1 An ideal cipher $E : \{0,1\}^{*} \times \{0,1\}^n \to \{0,1\}^n$ associates
an (invertible) random permutation $E(k, \cdot)$ with each key $k$.

2 Of course, a real block cipher cannot be ideal. (Likewise, a hash function
cannot be a random oracle either.) Yet modeling it as ideal captures the
adversary's inability of exploiting any structure, and a security proof in this
model implies in particular the inexistence of any generic attacks treating the
block cipher as a black box.

3 In particular, we require the existence of a standard-model reduction.

In particular, we advocate a safe and modular design paradigm where each of the
two properties should ideally rely only on one of the two component primitives,
whereas the other primitive may be arbitrarily insecure, except for (possibly)
satisfying some minimal structural requirement (that can be ensured by design),
such as $F$ being a permutation or $H$ being sufficiently regular. This differs
from the point of view taken in [23], where $H$ is guaranteed to be collision
resistant and is extended by means of an ideal primitive $F$ into a random
oracle, while preserving the collision-resistance guarantees of $H$: We believe
that practical considerations, especially efficiency, may in fact motivate the
use of hash functions with no provable security guarantees. Thus, it is
desirable that even the ability of finding collisions for $H$ does not impact
the indifferentiability of the construction, as long as $F$ is still ideal.
Either way, both points of view are related: Any solution satisfying our
stronger requirements (including the one we propose in this paper) also fits
within the framework of [23], while the solution proposed in [23] also
satisfies stronger requirements, as discussed below.

We also remark that using the multi-property combiner of [?] one can combine a
random oracle (built from $F$) and $H$ into a hash function that provably
observes both properties. However, as combiners inherently do not exploit the
knowledge of which one of both functions has a certain property, the resulting
construction is rather inefficient, e.g., it doubles the output length.

The MCM Approach. Given a hash function $H$ as above, the so-called
mix-compress-mix (MCM) approach, introduced by Ristenpart and Shrimpton [23],
considers the construction

$$\mathrm{MCM}^{M_1,M_2,H}(x) := M_2(H(M_1(x))),$$

where $M_1$ and $M_2$ are arbitrary-input-length injective maps (the so-called
mixing stages) with stretch $\tau_1$ and $\tau_2$, respectively, i.e., such
that $M_i$ outputs a string of length $|x| + \tau_i$ on input
$x \in \{0,1\}^*$. The injectivity of the mixing stages ensures that MCM
preserves the collision resistance of $H$ in the standard model. Additionally,
it was shown in [23] that MCM is indifferentiable from a random oracle if $M_1$
and $M_2$ are random injective oracles (i.e., $M_i$ returns a random
$(|x| + \tau_i)$-bit string for each input $x \in \{0,1\}^*$ that differs from
all previously returned values with the same length) and $H$ is collision
resistant and sufficiently regular. Dodis et al. [?] subsequently interpreted
this result as the combination of two facts: (i) The mapping
$x \mapsto H(M_1(x))$ is preimage aware$^5$ under the same assumptions, and
(ii) Post-processing the output of a preimage-aware function with a (possibly
injective) random oracle yields a full-fledged random oracle. A concrete
instantiation of injective random oracles - called the TE-construction -
relying on an ideal cipher and a trapdoor one-way permutation has also been
proposed in [?]. To date, this was the only known such construction.

4 Most hash functions rely on some iterated (and thus inherently online)
design, such as Merkle-Damgård [?] or sponges [?].

5 Informally, a construction $C^F$ based on an ideal primitive $F$ is preimage
aware if there exists an algorithm - called the preimage extractor - which
given the input-output history of $F$ and an output $y$, either aborts or
returns $x$ such that $C^F(x) = y$, and after such query no adversary can find
an input $x'$ such that $C^F(x') = y$ (and $x' \ne x$ in case the extraction
query did not abort).
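The shape of the MCM composition, and the reason injective mixing stages preserve collision resistance, can be illustrated with a small Python sketch. Everything here is a toy assumption: $H$ is instantiated with SHA-256 from the standard library, and `m1` and `m2` are trivially invertible injective maps chosen for readability. They are neither the TE-construction nor the mixing stages proposed in this paper, and they provide no random-oracle-like behavior.

```python
import hashlib

MASK = bytes(range(32))  # arbitrary fixed constant for the toy second stage

def m1(x: bytes) -> bytes:
    """Toy injective first mixing stage with a stretch of one byte."""
    return x[::-1] + b"\x01"

def m1_inverse(w: bytes) -> bytes:
    return w[:-1][::-1]

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def m2(y: bytes) -> bytes:
    """Toy injective (here even length-preserving) second mixing stage on 32 bytes."""
    return bytes(a ^ b for a, b in zip(y, MASK))

def mcm(x: bytes) -> bytes:
    """MCM(x) = M2(H(M1(x)))."""
    return m2(H(m1(x)))

def h_collision_from_mcm_collision(x1: bytes, x2: bytes):
    """Map an MCM collision (x1 != x2) to a collision for H.

    Since m2 is injective, H(m1(x1)) == H(m1(x2)); since m1 is injective,
    m1(x1) != m1(x2). The pair returned is therefore a collision for H.
    """
    assert x1 != x2 and mcm(x1) == mcm(x2)
    return m1(x1), m1(x2)
```

The reduction in `h_collision_from_mcm_collision` is the standard-model argument referred to above: it uses only the injectivity of the two mixing stages.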

Interestingly, we observe that the MCM approach provides a modular design
approach for hash functions as advocated above, since the indifferentiability
result can be made independent of the collision resistance of $H$. (This was
unnoticed in [?], and is briefly discussed in the full version of this paper.)
However, its deployment is subject to a number of practical and theoretical
drawbacks, whose solution was stated as an open problem in [?]: First, no
construction of injective random oracles (and in particular the
TE-construction) can be online, as, roughly speaking, each output bit needs to
be influenced by all of the input in order to exhibit random behavior.
Additionally, the fact that the TE-construction is length-increasing has a
serious impact on the resulting hash size: In particular, the stretch $\tau_i$
typically equals the bit length of a sufficiently secure RSA modulus, i.e.,
$\tau_i \ge 2048$ bits for reasonable security. Finally, the use of a trapdoor
one-way permutation within the TE-construction is rather undesirable: In
contrast to (non trapdoor) one-way permutations, the assumption is very strong,
e.g., it implies public-key encryption in the random oracle model [?]. Also, as
pointed out in [?], the compositional guarantees of protocols using the MCM
approach (with the TE-construction) to instantiate a random oracle are
affected, as properties such as deniability may be lost (cf. e.g. the works by
Pass [?] and by Canetti et al. [?]).

These observations give rise to a number of challenging open questions. Can we 
instantiate the first mixing stage of MCM with a weaker primitive which allows 
for online processing? Can we instantiate the second mixing stage (where online 
processing is not an issue) as an injective RO with limited stretch (possibly even 
with no stretch at all)? And finally, can we weaken the underlying assumption, 
eliminating the need of the trapdoor, or possibly even entirely removing the 
underlying assumption? 

Contributions and Roadmap of this Paper. In this paper, we present 
the first efficient modular construction of a hash function in the sense described 
above. Our solution relies on the MCM approach, and in particular we address 
and solve all of the aforementioned open questions, and hence make a substantial 
step towards making the MCM approach practical. 

First Mixing Stage. In Section 3 we present a novel mode of operation for a
block cipher $E : \{0,1\}^{2n} \times \{0,1\}^n \to \{0,1\}^n$ implementing an
arbitrary-input-length injective map - called iterated mix (IM) - that permits
online processing of its inputs, making only one call to $E$ per $n$-bit
message block, and has only stretch $n/2$. Our first main theorem shows that
the construction $\mathrm{IMC}^{E,H}(M) := H(\mathrm{IM}^E(M))$ applying $H$ to
the output of IM is preimage aware if $E$ is an ideal cipher and, additionally,
the hash function $H$ satisfies a rather weak regularity requirement (which is
somewhat incomparable to the one used in [23], albeit equally natural): Namely,
given a random $n$-bit string $m$ and some arbitrary string $S$, the value
$H(S \| m)$ has (min-)entropy not much lower than $n$ (if $n$ is smaller than
$H$'s hash size), or not much lower than the hash size otherwise. In fact, even
completely insecure hash functions can have this property, and it is also
natural to assume that it is satisfied by any reasonably built hash function.
We also present a variant of the IM-construction which requires a block cipher
with single-block key length $n$ at the price of making two block-cipher calls
per message block.

We stress that (contrary to the TE-construction) our result does not rely
on any computational assumptions: In particular, the IM-construction relies on 
invertible primitives, and is itself efficiently invertible. Thus, IM does not imple- 
ment a random injective oracle. 

Second Mixing Stage. With the goal of making the MCM approach preserve the
hash size of the underlying hash function in mind, the second part of this paper
(Section 4) addresses the question of building length-preserving injective random
oracles. (We call this a (non-invertible) random permutation oracle (RPO).)
We show that for any three permutations $E, E', \pi$ from n bits to n bits, the
permutation

$$\mathrm{NIRP}^{E,E',\pi}(x) := E'(\pi(E(x)))$$

is indifferentiable from a RPO if both E and E' are (fixed-key) ideal ciphers,
and $\pi$ is a one-way permutation, without a trapdoor.

In practice, E, E' are instantiated by a block cipher with two distinct fixed
keys. This limits us to n being a valid block size (e.g. n = 128 bits), which
can be smaller than the usual hash size (e.g. h = 256). This motivates the
question of extending the input/output size of random permutation oracles: In
Section 4.2 we present constructions (which are reminiscent of the
Shrimpton-Stam compression function) for extending every n to n bits RPO into a
$\gamma \cdot n$ bits to $\gamma \cdot n$ bits RPO for any fixed $\gamma > 1$.

In the full version we further show that in order to construct injective ROs the 
assumption of a one-way permutation cannot be weakened to a one-way function 
(at least under black-box security reductions). 

Putting Pieces Together. Finally, instantiating MCM with IM and NIRP (or its 
extension through our extender) as its first and second mixing stage, respectively, 
leads to the first construction of a hash function with the following properties: 

(i) Its collision resistance can be reduced in the standard model to the collision 
resistance of the underlying hash function. 

(ii) It is indifferentiable from a random oracle in the ideal cipher model (with 
a one-way permutation), as long as the underlying hash function is suffi- 
ciently regular. 

(iii) It can be evaluated online as long as the underlying hash function can be 
evaluated in an online fashion. 

(iv) It has hash size equal to the one of the underlying hash function. 

(v) It can be used to instantiate a random oracle in all computationally secure 
schemes in the random oracle model, with no composability limitations. 
2 Preliminaries

Notational Preliminaries. Throughout this paper, $\{0,1\}^n$ denotes the set
of strings s of length |s| = n, whereas $(\{0,1\}^n)^*$ and $(\{0,1\}^n)^+$ are the
sets of strings consisting of n-bit blocks with and without the empty string,
respectively. The notation $s \| s'$ stands for the concatenation of the strings
s and s'. Also, we use INJ(m, n) to denote the set of injective functions
$f : \{0,1\}^m \to \{0,1\}^n$ (in particular, INJ(n, n) is the set of permutations
from n bits to n bits). Further, it is convenient to define BC($\kappa$, n) as the
set of block ciphers, i.e., of keyed functions
$E : \{0,1\}^\kappa \times \{0,1\}^n \to \{0,1\}^n$ such that each key
$k \in \{0,1\}^\kappa$ defines a permutation $E_k(\cdot) := E(k, \cdot) \in \mathrm{INJ}(n, n)$
(and denote as $E^{-1}(k, \cdot)$ the corresponding inverse).

Algorithms are in general randomized, and throughout this paper we fix a
RAM model of computation for these algorithms. We use the notation $A^O(r)$
to denote the (oracle) algorithm A which runs on input r with access to the
oracle O. In particular, an algorithm is said to have running time t (also
denoted as time(A) = t) if the sum of its description length and the worst-case
number of steps it takes (counting oracle queries as single steps), taken over
all randomness values, all inputs and all compatible oracles, is at most t. If
the algorithm takes inputs of arbitrary length, then time(A, $\ell$) refines the
above notion to only take the maximum over inputs of length at most $\ell$.

Finally, the shorthand $x \xleftarrow{\$} S$ stands for the action of drawing a
fresh random element x uniformly from the set S, whereas $x \leftarrow A^O(r)$
denotes the process of sampling x by letting A interact with O on input r (and
probabilities are taken over the random coins of A and O).

One-Way Functions and Permutations. We define the one-way advantage
of an adversary A against a function $f : \{0,1\}^m \to \{0,1\}^n$ as

$$\mathrm{Adv}^{\mathrm{owf}}_{f}(\mathcal{A}) = P[x \xleftarrow{\$} \{0,1\}^m,\ x' \leftarrow \mathcal{A}(f(x)) : f(x) = f(x')].$$

For the special case of a permutation $\pi : \{0,1\}^n \to \{0,1\}^n$, it is
convenient to use the shorthand
$\mathrm{Adv}^{\mathrm{owp}}_{\pi}(\mathcal{A}) = P[x \xleftarrow{\$} \{0,1\}^n,\ x' \leftarrow \mathcal{A}(\pi(x)) : x = x']$
for the one-way permutation advantage.

Idealized primitives. We consider a number of (more or less) standard
idealized primitives throughout this paper, which are always denoted by boldface
letters. For a set X, a random oracle (RO) $R : X \to \{0,1\}^n$ is a system
associating a random n-bit string R(x) with each input x. If $X = \{0,1\}^m$,
then R is called a fixed-input-length RO (FIL-RO), whereas it is a
variable-input-length RO (VIL-RO) if $X = \{0,1\}^*$. An ideal cipher (IC)
$E : \{0,1\}^\kappa \times \{0,1\}^n \to \{0,1\}^n$ is a block cipher E chosen
uniformly from the set BC($\kappa$, n), and allows both forward queries E(k, x)
as well as backward queries $E^{-1}(k, y)$. If $\kappa = 0$, then we omit the
first input and we call this a fixed-key ideal cipher. Note that for an IC E
and distinct fixed key values $k_0, k_1, \dots$, the ciphers
$E(k_0, \cdot), E(k_1, \cdot), \dots$ are independent fixed-key ICs. In contrast,
a (fixed-input-length) random injective oracle (FIL-RIO)
$I : \{0,1\}^m \to \{0,1\}^n$ implements a uniformly chosen function from
INJ(m, n). In the special case m = n we call this a random permutation oracle
(RPO) $P : \{0,1\}^n \to \{0,1\}^n$.

We stress that the substantial difference between a fixed-key IC and a RPO
is that the former allows for inversion queries, whereas the latter does not (and 
is in particular hard to invert). 
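
The difference between the two interfaces can be made concrete with lazy
sampling. The following is a minimal sketch for toy block sizes only; the class
and method names are hypothetical and not taken from the paper. Both objects
sample the same uniform permutation on demand, but only the fixed-key ideal
cipher exposes backward queries.

```python
import random


class LazyPermutation:
    """Lazily sampled uniform permutation on n-bit integers (toy sizes only)."""

    def __init__(self, n_bits, seed=None):
        self.size = 1 << n_bits
        self.fwd = {}  # x -> y, filled in as queries arrive
        self.bwd = {}  # y -> x, kept consistent with fwd
        self.rng = random.Random(seed)

    def _fresh(self, used):
        # Rejection-sample a value that has not been assigned yet.
        while True:
            v = self.rng.randrange(self.size)
            if v not in used:
                return v

    def forward(self, x):
        if x not in self.fwd:
            y = self._fresh(self.bwd)
            self.fwd[x], self.bwd[y] = y, x
        return self.fwd[x]

    def backward(self, y):
        if y not in self.bwd:
            x = self._fresh(self.fwd)
            self.fwd[x], self.bwd[y] = y, x
        return self.bwd[y]


class FixedKeyIdealCipher(LazyPermutation):
    """Fixed-key IC: both forward and backward queries are available."""


class RandomPermutationOracle:
    """RPO: same distribution, but only forward queries are exposed."""

    def __init__(self, n_bits, seed=None):
        self._perm = LazyPermutation(n_bits, seed)

    def query(self, x):
        return self._perm.forward(x)
```

In an actual security experiment the adversary would of course not be able to
reach into the private state of the RPO; the wrapper only illustrates which
queries each primitive answers.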

Indifferentiability. The notion of indifferentiability was introduced by
Maurer et al. to generalize indistinguishability to constructions
$C^F : X \to \{0,1\}^n$ using a public (idealized) primitive F (e.g., an IC, a
FIL-RO, or a combination of these), i.e., that can be accessed by the adversary.
Roughly speaking, $C^F$ is indifferentiable from an ideal primitive F' if there
exists a simulator $S^{F'}$ accessing F' such that $(C^F, F)$ and $(F', S^{F'})$
are indistinguishable. In particular, we will be concerned with the cases where
F' is either a RO or a RIO/RPO, and we define the RO-indifferentiability
advantage of the distinguisher D against the construction $C^F$ and simulator S
as the quantity

$$\mathrm{Adv}^{\mathrm{ind\text{-}ro}}_{C^F,\mathcal{S}}(\mathcal{D}) = \left| P[\mathcal{D}^{C^F,F} = 1] - P[\mathcal{D}^{R,\mathcal{S}^R} = 1] \right|,$$

where $R : X \to \{0,1\}^n$ is a RO with the same input and output sets as C.
The IRO-indifferentiability advantage
$\mathrm{Adv}^{\mathrm{ind\text{-}iro}}_{C^F,\mathcal{S}}$ is defined analogously
by using a RIO I instead of R. We stress that both quantities are related by a
simple birthday-like argument, i.e.,
$\mathrm{Adv}^{\mathrm{ind\text{-}ro}}_{C^F,\mathcal{S}}(\mathcal{D}) \le \mathrm{Adv}^{\mathrm{ind\text{-}iro}}_{C^F,\mathcal{S}}(\mathcal{D}) + \frac{1}{2} \cdot (q + q_S)^2 \cdot 2^{-n}$,
where q is the number of queries D makes to its first oracle, whereas $q_S$ is
the overall number of queries S makes when answering D's queries. Note that
indifferentiability ensures composability, i.e., if a cryptographic scheme is
secure using an ideal primitive F' accessible by the adversary, then it remains
secure when replacing F' with a construction $C^F$ which is indifferentiable
from F' and letting
the adversary access F. See |1 f)|1 f)j for a formal treatment in the information- 
theoretic and computational models. 

Collision-Resistance. Let $H : \mathcal{K} \times \{0,1\}^* \to \{0,1\}^h$ be a
(keyed) hash function with key generator $\mathcal{K}$. The collision-finding
advantage of an adversary A is

$$\mathrm{Adv}^{\mathrm{cr}}_{H}(\mathcal{A}) := P[k \leftarrow \mathcal{K},\ (M, M') \leftarrow \mathcal{A}(k) : M \ne M' \wedge H_k(M) = H_k(M')].$$

The notion naturally extends to keyless hash functions (which can be consid- 
ered in the same spirit proposed in P3) and to constructions from some ideal 
primitive F (where A is additionally given access to F) . 

The MCM-Construction. For a hash function $H : \{0,1\}^* \to \{0,1\}^h$, and
injective maps $M_1 : \{0,1\}^* \to \{0,1\}^*$, $M_2 \in \mathrm{INJ}(h', n)$,
where $n \ge h' \ge h$, the MCM-construction implements a map
$\{0,1\}^* \to \{0,1\}^n$ as

$$\mathrm{MCM}^{M_1,H,M_2}(M) := M_2(H(M_1(M)) \,\|\, 0^{h'-h}).$$

We also define $\mathrm{MCM}^{M_1,H,M_2}_k := \mathrm{MCM}^{M_1,H_k,M_2}$ for all
$k \in \mathcal{K}$ if the hash function H is keyed (with key space
$\mathcal{K}$). Also, the definition does not allow $M_1, M_2$ to be
keyed (in contrast to [23]). This is because we will present keyless
instantiations of $M_1, M_2$. Note that we assume $M_2$ to be fixed input length
without loss of generality. The following simple result was shown in [23], and
holds both for keyed as well as for keyless hash functions.

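Lemma 1. For all collision-finding adversaries A outputting a pair of
messages each of length at most $\ell$, there exists a collision-finding
adversary B such that
$\mathrm{Adv}^{\mathrm{cr}}_{\mathrm{MCM}^{M_1,H,M_2}}(\mathcal{A}) = \mathrm{Adv}^{\mathrm{cr}}_{H}(\mathcal{B})$,
where $\mathrm{time}(\mathcal{B}) = \mathrm{time}(\mathcal{A}) + O(2(\ell + \mathrm{time}(M_1, \ell)))$.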
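
The reduction behind Lemma 1 is short enough to sketch in code. The example
below is illustrative only: the two mixing stages and the 8-bit hash are toy
stand-ins chosen here (they are not the instantiations of this paper), so that
a collision can be found by brute force. It shows that applying $M_1$ to both
messages of an MCM collision yields a collision for H, because both mixing
stages are injective.

```python
import hashlib
from itertools import count

H_LEN = 1    # toy hash size h = 8 bits, so collisions are easy to find
H_PRIME = 2  # h' = 16 bits: input length of the second mixing stage


def m1(msg: bytes) -> bytes:
    """Toy injective first mixing stage (hypothetical): reverse the bytes."""
    return msg[::-1]


def h(msg: bytes) -> bytes:
    """Toy compressing stage: SHA-256 truncated to h bits (deliberately weak)."""
    return hashlib.sha256(msg).digest()[:H_LEN]


def m2(block: bytes) -> bytes:
    """Toy injective second mixing stage on h'-bit inputs (hypothetical)."""
    assert len(block) == H_PRIME
    return bytes(b ^ 0xA5 for b in block)


def mcm(msg: bytes) -> bytes:
    """MCM(M) := M2( H(M1(M)) || 0^(h'-h) )."""
    return m2(h(m1(msg)) + bytes(H_PRIME - H_LEN))


def find_mcm_collision():
    """Brute-force adversary A against the toy MCM instance."""
    seen = {}
    for i in count():
        msg = i.to_bytes(4, "big")
        out = mcm(msg)
        if out in seen:
            return seen[out], msg
        seen[out] = msg


def reduce_to_h(collision):
    """Adversary B of Lemma 1: map an MCM collision to a collision for H."""
    msg_a, msg_b = collision
    return m1(msg_a), m1(msg_b)
```

The extra work done by B is two evaluations of $M_1$, which matches the shape of
the running-time overhead stated in the lemma.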

Preimage Awareness. We briefly review the notion of preimage awareness [12]
for a hash function $H^F : \{0,1\}^* \to \{0,1\}^h$ built from an idealized
primitive F. A preimage extractor $\mathcal{E}$ is a (deterministic) algorithm
taking a history $\alpha$ of input-output pairs of F and a value
$y \in \{0,1\}^h$ such that $\mathcal{E}(\alpha, y)$ returns a value
$x \in \{0,1\}^* \cup \{\bot\}$. We consider a random experiment (called the
pra-game) involving an adversary A which can query both F and
$\mathcal{E}(\alpha, \cdot)$ (where $\alpha$ is the current history containing
the interaction with F so far, i.e., the adversary cannot change the first
argument), and where a set Q contains all $\mathcal{E}$-queries y of A and
an associative array V stores as $V[y] \in \{0,1\}^* \cup \{\bot\}$ (for all
$y \in Q$) the answer of the query y to $\mathcal{E}$. The pra-advantage of the
adversary A with preimage extractor $\mathcal{E}$ and primitive F is the
quantity

$$\mathrm{Adv}^{\mathrm{pra}}_{H,F,\mathcal{E}}(\mathcal{A}) := P[(M, y) \leftarrow \mathcal{A}^{\mathcal{E}(\alpha,\cdot),F} : y \in Q \wedge H^F(M) = y \wedge V[y] \ne M].$$

It turns out that preimage aware functions are good domain extenders for
FIL-ROs: More concretely, with H as above, consider the construction
$C^{F,R'} : M \mapsto R'(H^F(M))$ for a FIL-RO $R' : \{0,1\}^h \to \{0,1\}^n$.
Then, the following result was proved in [12].

Lemma 2 (PRA + FIL-RO = VIL-RO [12]). There exists a simulator S
such that for all distinguishers D making q queries to $C^{F,R'}$ of length at
most $\ell$, $q_1$ queries to F and $q_2$ queries to R', there exists an
adversary A with

$$\mathrm{Adv}^{\mathrm{ind\text{-}ro}}_{C^{F,R'},\mathcal{S}}(\mathcal{D}) \le \mathrm{Adv}^{\mathrm{pra}}_{H,F,\mathcal{E}}(\mathcal{A}).$$

The simulator S runs in time $O(q_1 + q_2 \cdot \mathrm{time}(\mathcal{E}))$ and
makes $q_1$ queries, whereas A runs in time
$\mathrm{time}(\mathcal{D}) + O(q \cdot \mathrm{time}(H, \ell) + q_0 + q_1)$ and
makes $q \cdot q_{H,\ell} + q_1$ F-queries and $q_1$ extraction queries, where
$q_{H,\ell}$ is the maximal number of oracle queries made by H to process an
input of length at most $\ell$.

3 An On-Line Mixing Stage: The IMC-Construction 

3.1 Description 

The IM-Construction. The iterated mix construction (or IM-construction for
short), depicted in Figure 1, relies on a block cipher
$E : \{0,1\}^{2n} \times \{0,1\}^n \to \{0,1\}^n$ and an injective mapping
$\mathrm{PAD} : \{0,1\}^* \to \{0,1\}^{n/2} \times (\{0,1\}^n)^*$ which
pads every string so that it consists of one n/2-bit block, followed by as many
n-bit blocks as necessary (footnote 6). On input $M \in \{0,1\}^*$, it first
obtains $\mathrm{PAD}(M) = m_1 \| \dots \| m_\ell$, and computes the output
$y_1 \| \dots \| y_\ell$ iteratively such that
$y_1 := E(IV \| m_2, 0^{n/2} \| m_1)$ (where IV is an n-bit fixed initialization
value) and $y_i := E(y_{i-1} \| m_{i+1}, m_i)$ for all $i = 2, \dots, \ell$,
where $m_{\ell+1} := 0^n$.

Fig. 1. The IM-construction with block cipher $E : \{0,1\}^{2n} \times \{0,1\}^n \to \{0,1\}^n$

In contrast to the TE-construction of [23], the IM-construction is iterated and
allows for (essentially) online processing, with the minimal restriction that
only the first i-1 output blocks $y_1, \dots, y_{i-1}$ can be computed from the
first i message blocks $m_1, \dots, m_i$. This one-block-lookahead evaluation
strategy only marginally impacts the efficiency of the construction, and is
crucial in order to ensure the desired security requirements.

Injectivity of the IM-Construction. It is not difficult to see that the
construction is injective: Given an output $y_1 \| \dots \| y_\ell$ (for some
$\ell$) we can iteratively and efficiently reconstruct the padding
$m_1 \| \dots \| m_\ell$ of the input M by computing
$m_i := E^{-1}(y_{i-1} \| m_{i+1}, y_i)$ for all $i = \ell, \ell-1, \dots, 2$,
with $m_{\ell+1} = 0^n$, and finally
$0^{n/2} \| m_1 := E^{-1}(IV \| m_2, y_1)$. Thus, IM cannot be a VIL-RIO, and
not even one-way, even though it is surprisingly still strong enough to
instantiate the first mixing step of the MCM approach, as we show below.
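
The evaluation and inversion procedures just described can be sketched as
follows. This is an illustration under several assumptions made here, not the
paper's reference implementation: the block cipher is a toy 4-round Feistel
network built from SHA-256 (a stand-in, not an ideal cipher), n = 128 bits, IV
is the all-zero block, and padding is done at byte granularity (byte 0x80
followed by zero bytes), which is the canonical 1-then-0s padding for
byte-aligned messages.

```python
import hashlib

N = 16         # block length n in bytes (n = 128 bits)
HALF = N // 2  # n/2 in bytes
IV = bytes(N)  # hypothetical fixed initialization value


def _round(key, r, half):
    return hashlib.sha256(key + bytes([r]) + half).digest()[:HALF]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def E(key, x):
    """Toy keyed permutation on n-bit blocks (4-round Feistel)."""
    left, right = x[:HALF], x[HALF:]
    for r in range(4):
        left, right = right, _xor(left, _round(key, r, right))
    return left + right


def E_inv(key, y):
    """Inverse of E under the same key."""
    left, right = y[:HALF], y[HALF:]
    for r in reversed(range(4)):
        left, right = _xor(right, _round(key, r, left)), left
    return left + right


def pad(msg):
    """Split msg into one n/2-bit block followed by n-bit blocks."""
    padded = msg + b"\x80"
    while len(padded) < HALF or (len(padded) - HALF) % N != 0:
        padded += b"\x00"
    return [padded[:HALF]] + [padded[i:i + N] for i in range(HALF, len(padded), N)]


def unpad(blocks):
    padded = b"".join(blocks)
    return padded[:padded.rindex(b"\x80")]


def im(msg):
    """y_1 := E(IV || m_2, 0^(n/2) || m_1), y_i := E(y_(i-1) || m_(i+1), m_i)."""
    m = pad(msg) + [bytes(N)]  # append m_(l+1) = 0^n
    ys = [E(IV + m[1], bytes(HALF) + m[0])]
    for i in range(1, len(m) - 1):
        ys.append(E(ys[-1] + m[i + 1], m[i]))
    return ys


def im_inv(ys):
    """Recover M from y_1 .. y_l by walking the chain backwards."""
    nxt = bytes(N)  # m_(l+1) = 0^n
    blocks = []
    for i in range(len(ys) - 1, 0, -1):
        nxt = E_inv(ys[i - 1] + nxt, ys[i])
        blocks.append(nxt)
    first = E_inv(IV + nxt, ys[0])
    if first[:HALF] != bytes(HALF):
        raise ValueError("not a valid IM output")
    blocks.append(first[HALF:])
    return unpad(blocks[::-1])
```

For example, `im_inv(im(b"abc"))` returns `b"abc"`, and the total output length
exceeds the padded input length by exactly n/2 bits, the stretch mentioned in
the introduction.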

The IMC-Construction. It is convenient to define the combination of the
IM-construction and a hash function H as the iterated mix-compress construction
(or IMC-construction, for short), which, on input a string $M \in \{0,1\}^*$,
outputs $\mathrm{IMC}^{E,H}(M) := H(\mathrm{IM}^{E}(M))$. If H is keyed, then we
similarly define the keyed function
$\mathrm{IMC}^{E,H}_k(M) := \mathrm{IMC}^{E,H_k}(M)$. Note that if H can be
evaluated online, then this is the case for the IMC-construction as well.

Shorter Key Size. The use of a block cipher with key length equal to twice the
block length is acceptable in practice (footnote 7). Still, in order to ensure
compatibility with a larger number of block ciphers, we propose an alternative
construction (called the DM-IM-construction) which relies on a block cipher
$E : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n$, at the cost of making two calls
per processed message block. The underlying idea consists of producing an n-bit
key value at each round by using the Davies-Meyer construction on $y_{i-1}$ and
$m_{i+1}$: More precisely, we compute
$y_1 := E(E(m_2, IV) \oplus IV, 0^{n/2} \| m_1)$ and
$y_i := E(E(m_{i+1}, y_{i-1}) \oplus y_{i-1}, m_i)$ for all
$i = 2, \dots, \ell$. As above, for a hash function H, we define
$\mathrm{DM\text{-}IMC}^{E,H}(M) := H(\mathrm{DM\text{-}IM}^{E}(M))$. (And
analogously for the keyed case.)

6 This can be done in the canonical way by appending the bit 1 followed by as many
0 bits as necessary in order to fulfill the length requirement.

7 For instance, AES supports key size 256 bits with block length n = 128 bits.

3.2 Preimage Awareness 

The purpose of this section is to prove that, for an ideal cipher
$E : \{0,1\}^{2n} \times \{0,1\}^n \to \{0,1\}^n$, the construction
$\mathrm{IMC}^{E,H}$ is preimage aware, provided H satisfies very weak
randomness-preserving properties that we discuss first.

Hash Function Balance. The IMC-construction does not exhibit any useful 
properties if H can be arbitrary (consider e.g. the case where H is constant). It 
is nevertheless reasonable to assume H to satisfy minimal structural properties 
which could be (and generally are) ensured by design. In particular, we require 
H to preserve some of the randomness of a uniformly chosen input m of a 
given length n (where n is e.g. the block length of the cipher used in the IM- 
construction), and this should hold even if m is appended to some other fixed 
input string M. 

Definition 1. An (unkeyed) hash function $H : \{0,1\}^* \to \{0,1\}^h$ is
$(\varepsilon, n)$-prefix-balanced if for all messages $M \in (\{0,1\}^n)^*$ and
hash function outputs $y \in \{0,1\}^h$ we have
$P[m \xleftarrow{\$} \{0,1\}^n : H(M \| m) = y] \le \varepsilon$.

The notion extends naturally to a keyed hash function
$H : \{0,1\}^\kappa \times \{0,1\}^* \to \{0,1\}^h$: We say that it is
$(\varepsilon, n)$-prefix-balanced if for all keys k the function $H_k$
is $(\varepsilon(k), n)$-prefix-balanced, and
$\sum_k P(k) \cdot \varepsilon(k) \le \varepsilon$, where P(k) is the
probability that the key generator samples the key k. We remark that the best
$\varepsilon$ one can hope for is $\varepsilon = 2^{-n}$ as long as $n \le h$
holds, whereas $\varepsilon \ge 2^{-h}$ for $n > h$. Note that
our notion is somewhat incomparable to the one of m, where on the one hand 
balancedness under variable input lengths is considered (rather than for some 
fixed length n, as in our case), but, on the other hand, the property is not required 
under prepending of fixed prefixes: Still we find this extension to be natural in 
a hashing scenario. It is important to realize that prefix balancedness does not 
imply any useful security properties for H: The function
$H : (\{0,1\}^n)^+ \to \{0,1\}^n$ such that $H(M \| m) := m$ for all n-bit
strings m and all M with length a multiple of n is
$(2^{-n}, n)$-prefix-balanced, despite finding collisions or preimages
in this function being trivial.
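
For very small parameters, the prefix-balance of a function can be checked
exhaustively. The sketch below is only an illustration with hypothetical helper
names: it uses n = 4 bits, represents strings as tuples of bits, restricts the
prefixes to at most one block, and compares the last-block function from the
text with a constant function.

```python
from fractions import Fraction
from itertools import product

N_BITS = 4  # toy block length n


def blocks(n_bits):
    """All n-bit blocks, as tuples of bits."""
    return list(product((0, 1), repeat=n_bits))


def prefix_balance(hash_fn, prefixes, n_bits=N_BITS):
    """Smallest eps for which hash_fn is (eps, n)-prefix-balanced on `prefixes`.

    For each prefix M, count how often each output y is hit when the last
    block m is uniform, and keep the worst-case probability.
    """
    worst = Fraction(0)
    for prefix in prefixes:
        counts = {}
        for m in blocks(n_bits):
            y = hash_fn(prefix + m)
            counts[y] = counts.get(y, 0) + 1
        worst = max(worst, Fraction(max(counts.values()), 2 ** n_bits))
    return worst


def last_block(msg, n_bits=N_BITS):
    """H(M || m) := m, insecure but perfectly prefix-balanced."""
    return msg[-n_bits:]


def constant(msg):
    """A constant function, the degenerate case with eps = 1."""
    return (0,) * N_BITS


# Prefixes consisting of zero or one n-bit block.
PREFIXES = [()] + blocks(N_BITS)
```

Here `prefix_balance(last_block, PREFIXES)` evaluates to 1/16, i.e. $2^{-n}$ for
n = 4, while the constant function only achieves $\varepsilon = 1$.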

Main Theorem. The following theorem is the main result of the first part of 
this paper: It provides a concrete characterization of the security of the IMC- 
construction in the ideal-cipher model. We stress that the result only relies on 
E being an ideal cipher, and H being sufficiently balanced, but no computa- 
tional assumption is made, i.e., the result holds with respect to computationally 
unbounded adversaries. 

Theorem 1 (Preimage Awareness of IMC). Let
$E : \{0,1\}^{2n} \times \{0,1\}^n \to \{0,1\}^n$ be an ideal cipher and let
$H : \{0,1\}^* \to \{0,1\}^h$ be an $(\varepsilon, n)$-prefix-balanced
hash function. There exists a preimage extractor $\mathcal{E}$ (given in the
proof) such that, for all adversaries A issuing at most q queries to E and
$q_{\mathcal{E}}$ queries to $\mathcal{E}$, we have

$$\mathrm{Adv}^{\mathrm{pra}}_{\mathrm{IMC}^{E,H},E,\mathcal{E}}(\mathcal{A}) \le 3 \cdot q(q+1) \cdot 2^{-(n+1)} + q \cdot 2^{-n/2} + q(q + 2 q_{\mathcal{E}}) \cdot \varepsilon.$$

Furthermore, $\mathcal{E}$ answers an extraction query in time
$O(|\alpha| \cdot \log |\alpha|)$.
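
To get a feeling for the magnitudes involved, the bound can be evaluated
numerically. The helper below is a sketch that assumes the three-term form
$3q(q+1)2^{-(n+1)} + q\,2^{-n/2} + q(q + 2q_{\mathcal{E}})\varepsilon$; the
function name and the example parameters are chosen here for illustration.

```python
from fractions import Fraction


def imc_pra_bound(q, q_ext, n, eps):
    """Exact value of 3q(q+1)2^-(n+1) + q 2^-(n/2) + q(q + 2 q_ext) eps.

    q     : number of ideal-cipher queries
    q_ext : number of extraction queries
    n     : block length in bits (assumed even)
    eps   : prefix-balance parameter of H
    """
    term_cipher = Fraction(3 * q * (q + 1), 2 ** (n + 1))
    term_padding = Fraction(q, 2 ** (n // 2))
    term_balance = q * (q + 2 * q_ext) * Fraction(eps)
    return term_cipher + term_padding + term_balance


# Example parameters (assumptions): n = 128, best-possible eps = 2^-128,
# and 2^32 queries of each kind.
example = imc_pra_bound(q=2**32, q_ext=2**32, n=128, eps=Fraction(1, 2**128))
```

With these parameters the middle term $q \cdot 2^{-n/2} = 2^{-32}$ dominates,
which reflects the role of the n/2 zero bits in the first block.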

The result extends naturally to a keyed hash function by just averaging the 
bound over all choices of the key. The security of IMC is bounded by (roughly) 
min{2"/ 2 , \/e}, and is not worse than the one in the T E-construction (which 
additionally relies on the security of the underlying trapdoor one-way
permutation). Note that Theorem 1 is concerned with the entire IMC-construction:
An interesting (and seemingly challenging) open question consists of distilling
the (minimal) properties needed by IM to yield preimage awareness for IMC.

The remainder of this section is devoted to the proof outline of Theorem 1.
Technical details are postponed to the full version, as well as a discussion on
how to obtain similar bounds for DM-IMC.

Interaction Graphs. An interaction with the ideal cipher E can be described
in terms of the history $\alpha$, consisting of triples (k, x, y), where
$k \in \{0,1\}^{2n}$, and $x, y \in \{0,1\}^n$. Both a forward query E(k, x)
with output y and a backward query $E^{-1}(k, y)$ with output x result in a
triple (k, x, y) being added to $\alpha$ (footnote 8). However, it is far more
convenient to describe $\alpha$ in terms of a directed (edge labeled) graph
$G = G(\alpha) = (V, E)$ with vertex set $V := \{0,1\}^n$ and edge set
$E \subseteq V \times V$ such that $(y, y') \in E$ with labels
label(y, y') = m and next(y, y') = m' if (i) $(y \| m', m, y') \in \alpha$
with $y \ne IV$ or (ii) $(y \| m', 0^{n/2} \| m, y') \in \alpha$ if y = IV. A
(directed) path $IV = y_0 \to y_1 \to \dots \to y_\ell$ in G is called valid if
for all $i = 1, \dots, \ell - 1$ we have
$\mathrm{label}(y_i, y_{i+1}) = \mathrm{next}(y_{i-1}, y_i)$. It is additionally
called complete if $\mathrm{next}(y_{\ell-1}, y_\ell) = 0^n$. The value of a
complete valid path is defined as $H(y_1 \| \dots \| y_\ell)$, and its preimage
is the string M which is padded to
$\mathrm{label}(y_0, y_1) \| \dots \| \mathrm{label}(y_{\ell-1}, y_\ell)$.

The Preimage Extractor $\mathcal{E}$. On input a history $\alpha$ and a
(potential) output $z \in \{0,1\}^h$ of IMC, the preimage extractor
$\mathcal{E}$ first computes the subgraph G' of $G(\alpha)$ induced by the
vertices which are reachable through a valid path. If G' is not a directed
tree, then $\mathcal{E}$ aborts and outputs $\bot$. Otherwise, if G' contains
one single valid complete path with value z and preimage M, it outputs M. In
any other case, it outputs $\bot$.

It is not hard to see that $\mathcal{E}$ can be implemented with running time
$O(|\alpha| \cdot \log |\alpha|)$ (i.e., where $|\alpha|$ approximately equals
the number of edges in the graph $G(\alpha)$) due to the fact that
$\mathcal{E}$ aborts if G' is not a tree: Otherwise, the number of possible
valid paths may be very high, even exponential (footnote 9).

8 The actual history used in the definition of preimage awareness indeed contains more 
information, such as whether the triple is added by a forward or by a backward query, 
but this is irrelevant in the following. 

9 One may argue that we are taking a rather conservative approach: Even if the graph 
were not a tree, it would most likely have a limited number of valid paths. Still, this 
considerably simplifies the security analysis with no noticeable loss in the obtained 
bounds. 
Proof Intuition. Assume without loss of generality that the adversary A
never repeats a query twice (footnote 10) and that whenever it terminates in the
pra-game outputting a pair (M, z), it has made all queries necessary to evaluate
the IMC-construction on input M (with output z). In other words, the interaction
graph $G(\alpha)$ of the final history $\alpha$ contains a valid complete path
with preimage M and value z. But because the query z was previously issued to
$\mathcal{E}$, if A wins the game, one of the following has to occur: (i) The
subgraph of the valid paths is not a directed tree, (ii) No valid path with
value z existed when the $\mathcal{E}$-query z was issued, but such a path was
created afterwards, or (iii) There exist at least two valid paths with value z.
We show that these events are unlikely.

A key step is proving that, with very high probability, valid paths are con- 
structed only by means of forward queries: A construction of a valid path by 
backward queries may be successful either because we can “connect” the path 
with an already existing one (built by forward queries), or because we construct 
the entire path backwards. However, both cases turn out to be unlikely: In the 
former case, a fresh backward query outputs a random m (under the permutation 
property), and this can only be the next-label for an already existing edge with 
low probability. (This motivates the one-block-lookahead strategy in IM.) In the
latter case, it is very unlikely to have all of the first n/2 bits returned by the 
last evaluation query being equal to 0. (This motivates the padding in the first 
block.) However, if a path is generated only by forward queries, we can ensure 
that the value of a valid path is always sufficiently random due to the prefix- 
balancedness of H. We refer the reader to the full version for a formalization of 
this argument. 

This highlights a very intriguing property of the IM-construction: Although 
it can be efficiently inverted on any valid output, it is very unlikely that we can 
come up with such a valid output without first evaluating the construction. (In 
particular, this prevents that even a known collision for H will lead to a valid 
collision for the IMC-construction.) 

4 A Length-Preserving Mixing Stage: Random 
Permutation Oracles 

Post-processing the output of the IMC-construction with a random injective
oracle yields a full-fledged random oracle (by Theorem 1 and Lemma 2), whose
collision resistance can be reduced to the one of the underlying function H in
the standard model by Lemma 1. The use of the TE-construction [23] for this task
is subject to two main drawbacks: It requires a trapdoor one-way permutation and
also enlarges the output of the compressing stage. (The lack of online
evaluation capabilities is not a restriction, as we have to process only inputs
of fixed length equal to the output length of the underlying hash function.) In
this section, we solve both issues. We present a block-cipher based construction
of a fixed input-length length-preserving RIO, i.e., a (non-invertible) random
permutation oracle (RPO), that only relies on a one-way permutation without a
trapdoor. In the full version, we show that this assumption is somewhat minimal,
as RIOs/RPOs cannot be built from an ideal primitive and a one-way function.

10 In particular, if A asks a forward query E(k, x) which is answered by y, the
matching backward query $E^{-1}(k, y)$ is never issued. (And vice versa.)

Additionally, in order to reduce the dependence between the underlying block- 
and hash sizes, we present domain/range extenders for RPOs. 

4.1 Making Block-Ciphers Non-invertible: The NIRP-Construction 

Description. The NIRP-construction combines a permutation
$\pi : \{0,1\}^n \to \{0,1\}^n$ and two (fixed-key) ciphers
$E_1, E_2 : \{0,1\}^n \to \{0,1\}^n$ in a "sandwich-like" manner. More
precisely, for any input $m \in \{0,1\}^n$ the NIRP-construction is defined
such that $\mathrm{NIRP}^{E_1,E_2,\pi}(m) := E_2(\pi(E_1(m)))$. (Also cf.
Figure 2.) Obviously, $\mathrm{NIRP}^{E_1,E_2,\pi}$ is a permutation.
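
The sandwich structure is easy to write down for a toy domain. In the sketch
below, all three components are small random permutation tables chosen here for
illustration; in particular the stand-in for $\pi$ is not one-way at this size,
so the example only shows the composition and why inversion has to pass through
$\pi^{-1}$.

```python
import random

N_BITS = 8
SIZE = 1 << N_BITS


def random_permutation(seed):
    """A fixed pseudo-random permutation table on n-bit values (toy stand-in)."""
    table = list(range(SIZE))
    random.Random(seed).shuffle(table)
    return table


E1 = random_permutation(1)  # stands in for the fixed-key cipher E_1
E2 = random_permutation(2)  # stands in for the fixed-key cipher E_2
PI = random_permutation(3)  # stands in for pi; NOT one-way at this size


def nirp(m):
    """NIRP(m) := E_2(pi(E_1(m)))."""
    return E2[PI[E1[m]]]


def invert(table, y):
    """Invert a permutation table by search (feasible only for toy sizes)."""
    return table.index(y)


def nirp_inverse(y):
    """Inverting NIRP requires pi^{-1}, which a one-way pi is meant to deny."""
    return invert(E1, invert(PI, invert(E2, y)))
```

Since each layer is a permutation, so is the composition; with invertible ideal
ciphers on the outside, the only obstacle to inverting NIRP is the middle layer.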

Security of NIRP. We show that the NIRP-construction is indifferentiable
from a (non-invertible) random permutation oracle if instantiated with two ideal
single-key (footnote 11) block ciphers $E_1, E_2$ and a one-way permutation
$\pi$ (without a trapdoor). The result is summarized by the following theorem.

Theorem 2. Let $E_1, E_2 : \{0,1\}^n \to \{0,1\}^n$ be two independent
fixed-key ideal ciphers and let $\pi : \{0,1\}^n \to \{0,1\}^n$ be a
permutation. There exists a simulator S (given in the proof) such that for all
distinguishers D issuing at most q queries to the NIRP-construction, and at most
$q_a, q_b, q_c, q_d$ queries to $E_1, E_1^{-1}, E_2, E_2^{-1}$, respectively,
there exists an owp-adversary A with

$$\mathrm{Adv}^{\mathrm{ind\text{-}iro}}_{\mathrm{NIRP}^{E_1,E_2,\pi},\mathcal{S}}(\mathcal{D}) \le 2 \cdot q_c (2q + q_a) \cdot 2^{-n} + q_d \cdot \mathrm{Adv}^{\mathrm{owp}}_{\pi}(\mathcal{A}).$$

The simulator S runs in time
$O(q_a + q_b + q_c + q_d + (2q_a + q_b + 2q_d) \cdot \mathrm{time}(\pi))$
and makes $q_a + 2q_b + 2q_c$ queries to its oracle, whereas the adversary A
runs in time $\mathrm{time}(\mathcal{A}) \le \mathrm{time}(\mathcal{D}) + \mathrm{time}(\mathcal{S})$.

Outline of the Proof. The first part of the indifferentiability proof
describes the simulator $S^P$ that mimics the ideal ciphers $E_1, E_2$ (with
their inverses $E_1^{-1}, E_2^{-1}$) given access to a RPO
$P : \{0,1\}^n \to \{0,1\}^n$. Moreover we use the notation
$S^P = (S_{E_1}, S_{E_2}, S_{E_1^{-1}}, S_{E_2^{-1}})$ to make the four
sub-oracles of the simulator (answering the different query types) explicit. The
second part (which is postponed to the full version) upper bounds D's advantage
$\mathrm{Adv}^{\mathrm{ind\text{-}iro}}_{\mathrm{NIRP}^{E_1,E_2,\pi},\mathcal{S}}(\mathcal{D})$
in distinguishing the ideal setting (with a simulator) and the real setting.

The Simulator. The global state of the simulator $S^P$ consists of a table T
(which is initially empty) of tuples of the form (a, b, c, d) consistent with
evaluations of the NIRP-construction as in Figure 2, that is, where a, b are
simulated input-output values of the first cipher $E_1$, i.e., $E_1(a) = b$
(which can be generated both by forward queries to $E_1$ and by backward queries
to $E_1^{-1}$) and analogously c, d play the same role for the second block
cipher $E_2$. Furthermore, the invariant $c = \pi(b)$ and $P(a) = d$ holds. It
is also convenient to define $A \subseteq \{0,1\}^n$ as the set of values
$a \in \{0,1\}^n$ such that $(a, b, c, d) \in T$ for some b, c, d. Analogously,
we define the sets B, C, and D.

11 Recall that in the ideal cipher model, it is easy to derive two such ciphers from a
single ideal cipher $E : \{0,1\}^\kappa \times \{0,1\}^n \to \{0,1\}^n$ as
$E_1 := E(k_1, \cdot)$ and $E_2 := E(k_2, \cdot)$ for two arbitrary distinct
keys $k_1 \ne k_2$.

To achieve perfect simulation given oracle access to P, upon a new query to
one of its four sub-oracles the simulator defines a new tuple (a, b, c, d) in T,
with the input of the query placed at the appropriate position (as long as no
such tuple already exists, in which case the corresponding output value is
returned), and such that all remaining components are set to independent random
values conditioned on these individual values appearing in no other tuple, on
d = P(a), and on $c = \pi(b)$. This is easily achievable with access to
$\pi^{-1}$ and $P^{-1}$: For example, on input a (to $S_{E_1}$), we choose a
random $b \xleftarrow{\$} \{0,1\}^n \setminus B$ (i.e., different from all b'
appearing in some other tuple), and set $c := \pi(b)$ and $d := P(a)$. (This
is done analogously on input b.) On the other hand, on input c, we compute
$b := \pi^{-1}(c)$, a random $a \xleftarrow{\$} \{0,1\}^n \setminus A$ (i.e.,
different from all previous a'), and then set $d := P(a)$. Finally, on input d,
we set $a := P^{-1}(d)$ and subsequently generate a random
$b \xleftarrow{\$} \{0,1\}^n \setminus B$ and set $c := \pi(b)$.

However, in our setting we have to dispense with $\pi^{-1}$ and $P^{-1}$. In
particular, this means that in the latter two cases the simulator cannot set the
values b and a, respectively, but rather sets these components to a dummy value
$\bot$, and completes these tuples with the actual values if they eventually
appear as inputs of $E_1$ or $E_1^{-1}$ queries. Also note that the simulator
must not generate random values a and b that collide with a dummy value in order
to ensure the permutation property. This can be efficiently avoided by simply
testing that $P(a) \ne d$ (and $\pi(b) \ne c$) for all d's in tuples of the form
$(\bot, b, c, d)$ (all c's in tuples of the form $(a, \bot, c, d)$), and
whenever the test fails, we replace the dummy value by the actual value, and
draw a new a (or b). There are only two remaining cases where the simulator
fails to answer queries (and aborts):

(i) A query a is made and a tuple $(a, \bot, c, d)$ exists: In this case the
simulator must return $\pi^{-1}(c)$, but this requires inverting $\pi$, which
is generally not feasible. (Call this event $\mathrm{Abort}_1$.)

(ii) A query b is made and a tuple $(\bot, b, c, d)$ exists: In this case, the
simulator must return $P^{-1}(d)$, but cannot invert P. (Call this event
$\mathrm{Abort}_2$.)

By the above discussion, perfect simulation is achieved until one of these events
occurs: A game-based argument yields
$\mathrm{Adv}^{\mathrm{ind\text{-}iro}}_{\mathrm{NIRP}^{E_1,E_2,\pi},\mathcal{S}}(\mathcal{D}) \le P[\mathrm{Abort}_1] + P[\mathrm{Abort}_2]$.
In the full version we give a complete pseudo-code description of the
simulator and show that the probabilities of both events are very small.

NIRP = MCM with Invertible Mixing Steps? Our NIRP-construction
somehow reflects the MCM design with a permutation, instead of a hash
function, and this may suggest that the MCM approach works for invertible
mixing steps as well. Yet, we remark that the proof cannot be adapted to the
case where the first mixing stage processes inputs of variable input-length: The
problem is that in the simulation of queries to $E_2$ and $E_2^{-1}$ we need to
choose a pair a, P(a) and b, $\pi(b)$ respectively, and at a later time possibly
learn the missing dummy values b and a when they are queried. But in order for
this to succeed, we need the length of a and b to be compatible with the one of
such later query, which is of course impossible in the variable-input-length
case.

Fig. 2. Left: The NIRP-construction for underlying fixed-key block ciphers $E_1, E_2$, with
(a, b, c, d) corresponding to the notation used in the simulator of Theorem 2. Right: The
ESS-construction for underlying permutations $P_1, \dots, P_6 : \{0,1\}^n \to \{0,1\}^n$.

4.2 Extension of Random Permutation Oracles 

The use of the NIRP-construction to post-process the output of a hash function H
requires a block cipher with block size at least as large as its hash size,
i.e., typically at least 160 bits. While block ciphers with large block size
exist (footnote 12), ciphers such as AES support only rather small block
lengths, such as 128 bits. This motivates the following natural question: Given
a RPO $P : \{0,1\}^n \to \{0,1\}^n$, can we devise a construction
$C^P : \{0,1\}^m \to \{0,1\}^m$ for m > n which implements a permutation
and is indifferentiable from a RPO? Note that this calls for simultaneous domain
and range extension of P, while we additionally want to ensure injectivity of the
resulting construction. The problem is similar in spirit to the one considered in the 
private-key setting by Halevi and Rogaway [03, even though the peculiarities of 
the public setting make constructions far more challenging (footnote 13).

The ESS-Construction. We present a construction, called ESS, for the case
$m = 2n$ that relies on six permutations $P_1, \dots, P_6 : \{0,1\}^n \to \{0,1\}^n$ and is
reminiscent of the compression function $\mathrm{SS}^{P_1,P_2,P_3} : \{0,1\}^{2n} \to \{0,1\}^n$ by
Shrimpton and Stam [25], defined as
$\mathrm{SS}^{P_1,P_2,P_3}(m_1 \| m_2) := P_3(P_1(m_1) \oplus P_2(m_2)) \oplus P_1(m_1)$.
It adds three extra calls (as depicted in Figure 2) to ensure both indifferentiability
of the $2n$-bit output, as well as invertibility. It is indeed not hard to verify
that ESS implements a permutation: Given output $y_1 \| y_2$, the first input-half
$m_1$ is retrieved by computing $z := P_6^{-1}(y_2)$, $m_1 := z \oplus P_5^{-1}(y_1)$, and finally we
compute $m_2 := P_2^{-1}(P_1(m_1) \oplus P_3^{-1}(P_1(m_1) \oplus P_4^{-1}(z)))$. (Of course, the inverses
$P_i^{-1}$ are not efficiently computable in general, but they are well-defined.)
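The inversion argument above can be checked on a toy instance. The following Python sketch is only an illustration under stated assumptions: the block size of 8 bits, the seeded table-based permutations, and the helper names (`make_permutation`, `P`, `P_inv`, `ess`, `ess_inverse`) are hypothetical, and the forward direction of ESS is inferred from the inversion formulas in the text (the figure is not reproduced here), namely $z = P_4(\mathrm{SS}(m_1 \| m_2))$, $y_1 = P_5(z \oplus m_1)$, $y_2 = P_6(z)$.

```python
import random

N_BITS = 8  # toy block size n (assumption, for illustration only)


def make_permutation(seed, n_bits=N_BITS):
    """Return (table, inverse_table) for a fixed pseudo-random permutation."""
    rng = random.Random(seed)
    table = list(range(1 << n_bits))
    rng.shuffle(table)
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return table, inverse


# Six independent toy permutations standing in for P_1, ..., P_6.
PERMS = [make_permutation(seed) for seed in range(1, 7)]


def P(i, x):
    return PERMS[i - 1][0][x]


def P_inv(i, y):
    return PERMS[i - 1][1][y]


def ss(m1, m2):
    """Shrimpton-Stam compression: P3(P1(m1) xor P2(m2)) xor P1(m1)."""
    return P(3, P(1, m1) ^ P(2, m2)) ^ P(1, m1)


def ess(m1, m2):
    """Forward direction as inferred from the inversion steps in the text."""
    z = P(4, ss(m1, m2))
    return P(5, z ^ m1), P(6, z)


def ess_inverse(y1, y2):
    """Inversion exactly as described in the text."""
    z = P_inv(6, y2)
    m1 = z ^ P_inv(5, y1)
    m2 = P_inv(2, P(1, m1) ^ P_inv(3, P(1, m1) ^ P_inv(4, z)))
    return m1, m2
```

On this toy instance, applying `ess_inverse` to `ess(m1, m2)` recovers `(m1, m2)` for every input pair, which is the permutation property claimed above; the sketch says nothing about indifferentiability.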


12 Interestingly, such block ciphers are exactly the ones used within hash functions, e.g.,
to instantiate the Davies-Meyer construction.

13 In particular, each such extender implies the construction of a compression function
$\{0,1\}^m \to \{0,1\}^\ell$ for all $\ell < m$ from length-preserving random oracles which is
indifferentiable from a random oracle from $m$ bits to $\ell$ bits, a problem which has recently
received much interest (cf. e.g. I2i)l2h| h On top of this, injectivity is an extra design 
challenge. 
Indifferentiability of ESS. The following theorem shows that whenever
the underlying permutations are independent RPOs, the ESS-construction is
indifferentiable from an RPO up to the birthday barrier.

Theorem 3. Let $P_1, \dots, P_6 : \{0,1\}^n \to \{0,1\}^n$ be independent RPOs. There
exists a simulator $S$ such that for all distinguishers $D$ making at most $q$ queries
to the ESS-construction and to each of the underlying RPOs, we have

$$\mathrm{Adv}^{\mathrm{indiff}}_{\mathrm{ESS}^{P_1,\dots,P_6},\,S}(D) \le \left[ q^2 \cdot (4n^2 + n + 28) + q \cdot (3n + 13) \right] \cdot 2^{-n}.$$

The simulator $S$ runs in time $O(q^2)$ and makes $q$ queries.

Arbitrary Extension. A generalization of ESS, called MD-ESS, to construct
an RPO $\{0,1\}^{i \cdot n} \to \{0,1\}^{i \cdot n}$ for $i \ge 2$ using $4 + i$ independent RPOs from $n$
bits to $n$ bits and making $4i + 1$ RPO evaluations in total can be obtained as
follows: Let $\mathrm{MD\text{-}SS}^{P_1,P_2,P_3} : \{0,1\}^{n \cdot i} \to \{0,1\}^n$ be the (plain) Merkle-Damgård
iteration (with no strengthening) that on input $M = m_1 \| \dots \| m_i$ computes
$v_j := \mathrm{SS}^{P_1,P_2,P_3}(v_{j-1} \| m_j)$ for $j = 1, \dots, i$ (with $v_0$ being the IV), and outputs
$v_i$. Then, on input $M = m_1 \| \dots \| m_i \in \{0,1\}^{n \cdot i}$, MD-ESS first computes
$y := P_4(\mathrm{MD\text{-}SS}^{P_1,P_2,P_3}(M))$, and finally outputs

$$(P_{4+1}(y) \oplus m_1) \,\|\, \cdots \,\|\, (P_{4+i-1}(y) \oplus m_{i-1}) \,\|\, P_{4+i}(y).$$

To verify that MD-ESS implements a permutation, we remark that its output
uniquely determines $y$ and $m_1, \dots, m_{i-1}$, whereas $m_i$ is determined by the
chaining value $v_{i-1}$ and $P_4^{-1}(y)$ as in the ESS-construction. Its security is shown in
the full version. There, we also show that $P_{4+1}, \dots, P_{4+i-1}$ (but not $P_{4+i}$) can be
replaced by (invertible) single-key (ideal) ciphers. Also, it can easily be modified
to support inputs with lengths $n' > n$ which are not multiples of $n$.

5 Conclusions 

In this paper, we have shown the first modular and fault-tolerant hash function
construction which achieves both collision resistance in the standard model and
indifferentiability in the ideal model. In particular, this was achieved by building
appropriate mixing steps IM and NIRP that are compatible with the MCM-construction
and preserve the practical features of the inner compressing part,
i.e., the hash function $H$. By Lemma ??, the construction $\mathrm{MCM}^{\mathrm{IM},H,\mathrm{NIRP}}$ (where
possibly NIRP is replaced by its extension through one of the constructions
presented in Section 4.2) inherits the collision resistance of $H$, as IM and NIRP
are injective functions. In the ideal setting, we have shown that the combination
of IM and $H$ is preimage aware as long as $H$ is sufficiently balanced (Theorem ??),
and that NIRP is indifferentiable from a random permutation oracle (Theorem ??).
Thus, by applying Lemma ??, we conclude that $\mathrm{MCM}^{\mathrm{IM},H,\mathrm{NIRP}}$ is indifferentiable
from a variable-input-length random oracle.

While the IM-construction is very practical, the implementation of the NIRP-construction,
despite its efficiency, is conditioned on the existence of a one-way
permutation with input length equal to that of existing block ciphers. Indeed,
sufficiently secure candidate one-to-one functions exist for similar input parameters
(e.g., the discrete logarithm problem in properly chosen elliptic curves of
prime order $q \approx 2^n$ can in general not be solved better than with running time
roughly $O(2^{n/2})$, i.e., the security of our constructions), but the fact that the
block cipher expects $n$-bit inputs makes their use difficult (see footnote 14).
However, we stress that such data-type conversion problems are common in practical
constructions. For instance, when using an RSA-based trapdoor one-way permutation,
the output of the TE-construction [?] must be (injectively) transformed into a string,
and the result may be far from being random (attempting to extract random
bits would destroy the injectivity property). It is our strong belief that these
results should foster further research in designing good candidates for such central
cryptographic primitives working at the bit level.

Acknowledgments. We thank Marc Fischlin, Ueli Maurer, Thomas Risten- 
part, and the anonymous reviewers for valuable comments. Anja Lehmann is 
supported by the Emmy Noether Program Fi 940/2-1 of the German Research 
Foundation (DFG). Stefano Tessaro is supported by the Swiss National Science 
Foundation (SNF), project no. 200020-113700/1. The work described in this pa- 
per has been supported in part by the European Commission through the ICT 
program under contract ICT-2007-216676 ECRYPT II. 

References 

1. Andreeva, E., Neven, G., Preneel, B., Shrimpton, T.: Seven-property-preserving it- 
erated hashing: ROX. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, 
pp. 130-146. Springer, Heidelberg (2007) 

2. Bellare, M., Ristenpart, T.: Multi-property preserving hash domain extensions 
and the EMD transform. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006. LNCS, 
vol. 4284, pp. 299-314. Springer, Heidelberg (2006) 

3. Bellare, M., Rogaway, P.: Random oracles are practical: A paradigm for designing 
efficient protocols. In: ACM CCS 1993. ACM Press, New York (1993) 

4. Bellare, M., Rogaway, P.: Optimal asymmetric encryption — how to encrypt with 
RSA. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 92-111. 
Springer, Heidelberg (1995) 

5. Bellare, M., Rogaway, P.: The exact security of digital signatures — how to sign 
with RSA and Rabin. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, 
pp. 399-416. Springer, Heidelberg (1996) 

6. Bertoni, G., Daemen, J., Peeters, M., Assche, G.V.: On the indifferentiability of the 
sponge construction. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, 
pp. 181-197. Springer, Heidelberg (2008) 

7. Canetti, R., Dodis, Y., Pass, R., Walfish, S.: Universally composable security
with global setup. In: Vadhan, S.P. (ed.) TCC 2007. LNCS, vol. 4392, pp. 61-85.
Springer, Heidelberg (2007)

14 A natural candidate operating on $n$-bit strings arises from exponentiation in the
multiplicative group of the extension field $GF(2^n)$. However, due to the existence of
non-generic attacks, $n$ has to be chosen large enough, i.e., around the same size as
a reasonably secure RSA modulus.




8. Canetti, R., Goldreich, O., Halevi, S.: The random oracle methodology, revisited. 
In: STOC 1998, pp. 209-218. ACM Press, New York (1998) 

9. Contini, S., Lenstra, A.K., Steinfeld, R.: VSH, an efficient and provable collision- 
resistant hash function. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, 
vol. 4004, pp. 165-182. Springer, Heidelberg (2006) 

10. Coron, J.-S., Dodis, Y., Malinaud, C., Puniya, P.: Merkle-Damgård revisited: How
to construct a hash function. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621,
pp. 430-448. Springer, Heidelberg (2005)

11. Damgård, I.: A design principle for hash functions. In: Brassard, G. (ed.) CRYPTO
1989. LNCS, vol. 435, pp. 416-427. Springer, Heidelberg (1990)

12. Dodis, Y., Ristenpart, T., Shrimpton, T.: Salvaging Merkle-Damgård for practical
applications. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, vol. 5479, pp. 371-388.
Springer, Heidelberg (2009)

13. Fischlin, M., Lehmann, A.: Multi-property preserving combiners for hash functions. 
In: Canetti, R. (ed.) TCC 2008. LNCS, vol. 4948, pp. 375-392. Springer, Heidelberg 
(2008) 

14. Fischlin, M., Lehmann, A., Pietrzak, K.: Robust multi-property combiners for hash
functions revisited. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M.,
Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part II. LNCS, vol. 5126, pp.
655-666. Springer, Heidelberg (2008)

15. Fujisaki, E., Okamoto, T., Pointcheval, D., Stern, J.: RSA-OAEP is secure under
the RSA assumption. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, p. 260.
Springer, Heidelberg (2001)

16. Halevi, S., Rogaway, P.: A tweakable enciphering mode. In: Boneh, D. (ed.) 
CRYPTO 2003. LNCS, vol. 2729, pp. 482-499. Springer, Heidelberg (2003) 

17. NIST SHA-3 Competition, http://csrc.nist.gov/groups/ST/hash/sha-3/ 

18. Lyubashevsky, V., Micciancio, D., Peikert, C., Rosen, A.: SWIFFT: A modest 
proposal for FFT hashing. In: Nyberg, K. (ed.) FSE 2008. LNCS, vol. 5086, pp. 
54-72. Springer, Heidelberg (2008) 

19. Maurer, U., Renner, R., Holenstein, C.: Indifferentiability, impossibility results on 
reductions, and applications to the random oracle methodology. In: Naor, M. (ed.) 
TCC 2004. LNCS, vol. 2951, pp. 21-39. Springer, Heidelberg (2004) 

20. Maurer, U., Tessaro, S.: Domain extension of public random functions: Beyond 
the birthday barrier. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 
187-204. Springer, Heidelberg (2007) 

21. Merkle, R.: One way hash functions and DES. In: Brassard, G. (ed.) CRYPTO 
1989. LNCS, vol. 435, pp. 428-446. Springer, Heidelberg (1990) 

22. Pass, R. : On deniability in the common reference string and random oracle model. In: 
Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 316-337. Springer, Heidelberg 
(2003) 

23. Ristenpart, T., Shrimpton, T.: How to build a hash function from any collision- 
resistant function. In: Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 
147-163. Springer, Heidelberg (2007) 

24. Rogaway, P.: Formalizing human ignorance. In: Nguyen, P.Q. (ed.) VIETCRYPT 
2006. LNCS, vol. 4341, pp. 211-228. Springer, Heidelberg (2006) 

25. Shrimpton, T., Stam, M.: Building a collision-resistant compression function
from non-compressing primitives. In: Aceto, L., Damgård, I., Goldberg, L.A.,
Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part II.
LNCS, vol. 5126, pp. 643-654. Springer, Heidelberg (2008)


How to Confirm Cryptosystems Security: 
The Original Merkle-Damgård Is Still Alive!


Yusuke Naito 1 , Kazuki Yoneyama 2 , Lei Wang 3 , and Kazuo Ohta 3 


1 Mitsubishi Electric Corporation 
2 NTT Corporation 

3 The University of Electro-Communications 


Abstract. At Crypto 2005, Coron et al. showed that the Merkle-Damgård
hash function (MDHF) with a fixed input length random oracle is not
indifferentiable from a random oracle RO due to the extension attack.
Namely, MDHF does not behave like RO. This result implies that there
exists some cryptosystem secure in the RO model but insecure under
MDHF. However, this does not imply that no cryptosystem is secure
under MDHF. This fact motivates us to establish a methodology
for confirming cryptosystems security under MDHF.

In this paper, we confirm cryptosystems security by using the following
approach:

1. Find a variant, $\widetilde{\mathrm{RO}}$, of RO which leaks the information needed to
realize the extension attack.

2. Prove that MDHF is indifferentiable from $\widetilde{\mathrm{RO}}$.

3. Prove cryptosystems security in the $\widetilde{\mathrm{RO}}$ model.

From the indifferentiability framework, a cryptosystem secure in the $\widetilde{\mathrm{RO}}$
model is also secure under MDHF. Thus we concentrate on finding $\widetilde{\mathrm{RO}}$,
which is weaker than RO.

We propose the Traceable Random Oracle (TRO) which leaks enough 
information to permit the extension attack. By using TRO, we can easily 
confirm the security of OAEP and variants of OAEP. However, there are 
several practical cryptosystems whose security cannot be confirmed by 
TRO (e.g. RSA-KEM). This is because TRO leaks information that is 
irrelevant to the extension attack. Therefore, we propose another $\widetilde{\mathrm{RO}}$,
the Extension Attack Simulatable Random Oracle, ERO, that leaks just
the information needed for the extension attack. Fortunately, ERO is 
necessary and sufficient to confirm the security of cryptosystems under 
MDHF. This means that the security of any cryptosystem under MDHF 
is equivalent to that under the ERO model. We prove that RSA-KEM is 
secure in the ERO model. 

Keywords: Indifferentiability, Merkle-Damgard hash function, Variants 
of Random Oracle, Cryptosystems Security. 

1 Introduction 

Indifferentiability Framework. Maurer et al. [?] introduced the indifferentiability
framework as a notion stronger than indistinguishability. This framework

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 382-398, 2009.

© International Association for Cryptologic Research 2009




deals with the security of two systems $C(V)$ and $C(U)$: for cryptosystem $C$, $C(V)$
retains at least the same level of provable security as $C(U)$ if primitive $V$ is
indifferentiable from primitive $U$, denoted by $V \sqsubset U$. This definition allows us
to use construction $V$ instead of $U$ in any cryptosystem $C$ and retain the same
level of provable security, due to the indifferentiability framework of Maurer et
al. [?]. We denote "$C(V)$ is at least as secure as $C(U)$" by $C(V) \succ C(U)$. More
strictly, $V \sqsubset U \Leftrightarrow C(V) \succ C(U)$ holds. This result implies that if cryptosystem
$C$ is secure in the $U$ model and $V \sqsubset U$ holds, $C$ is secure in the $V$ model, and
if $V \not\sqsubset U$ holds, there is some cryptosystem that is secure in the $U$ model but
insecure in the $V$ model.

Indifferentiability and the MD Construction. While many cryptosystems
have been proven to be secure in the random oracle (RO) model [?] (e.g. FDH
[?], OAEP [?], RSA-KEM [?], Prefix-MAC [?] and so on), where RO is modeled
as a monolithic entity (i.e. a black box working in domain $\{0,1\}^*$), in practice
most instantiations that use a hash function are constructed by iterating
a fixed input length primitive (e.g. a compression function). There are many
architectures based on iterated hash functions. The most well-known one is the
Merkle-Damgård (MD) construction [?]. A hash function with the MD construction
iterates the underlying compression function $f : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^n$ as
follows.

$\mathrm{MD}^f(m_1, \dots, m_l)$ ($|m_i| = t$, $i = 1, \dots, l$):
let $y_0 = IV$ be some $n$ bit fixed value,
for $i = 1$ to $l$ do $y_i = f(y_{i-1}, m_i)$
return $y_l$

There is a significant gap between RO and hash functions, since hash functions
are constructed from a small primitive $f$ while RO is a monolithic random
function.

Coron et al. [?] made important observations on the cryptosystems that use
the indifferentiability framework. They introduced the new iterated hash function
property of indifferentiability from RO. In this framework, the underlying primitive,
$G$, is a fixed input length random oracle (denoted here as FILRO or $h$) or an
ideal block cipher. We say that hash function $H^G$ is indifferentiable from RO if
there exists a simulator $S$ such that no distinguisher can distinguish $H^G$ from RO
($S$ mimics $G$). The distinguisher can access RO/$H^G$ and $S$/$G$; $S$ can access RO.
A hash function $H^G$ that satisfies this property behaves like RO. Therefore,
replacing the RO of any cryptosystem by $H^G$ does not destroy its security.

Coron et al. analyzed the indifferentiability from RO for several specific
constructions. For example, they have shown that $\mathrm{MD}^h$ is not indifferentiable from
RO due to the extension attack, which uses the following property: the output
value $z' = \mathrm{MD}^h(M \| m)$ can be calculated by $c = h(z, m)$ where $z = \mathrm{MD}^h(M)$,
so $z' = c$. On the other hand, no $S$ can return the output value $z' = \mathrm{RO}(M \| m)$
from query $(z, m)$ where $z = \mathrm{RO}(M)$, since no $S$ knows $z'$ from $z$ and $m$, and $z'$ is
chosen at random. Therefore, no $S$ can simulate the extension attack. This result
implies that $\mathrm{MD}^h$ does not behave like RO and there exists some cryptosystem




that is secure in the RO model but insecure under $\mathrm{MD}^h$ due to the
indifferentiability framework. Their solution was to propose several constructions such as
Prefix-Free MD, chop MD, NMAC and HMAC. Hash functions with these constructions
are, under $h$, indifferentiable from RO. It seems impossible to prove
that the important original MD cryptosystem is secure.

MD Construction Dead? The MD construction is among the most important
foundations of modern cryptosystems [?]. There are two main reasons:

1. MD construction is employed by many popular hash functions such as SHA-1 
and SHA-256, and 

2. MD construction is more efficient than other iterated hash functions such as 
Prefix-Free MD, and chop MD. 

Since $\mathrm{MD}^h \not\sqsubset \mathrm{RO}$ holds, there is some cryptosystem $C^*$ that is secure in the RO
model but insecure under $\mathrm{MD}^h$. Thus the important question is "can we confirm
that a given cryptosystem is secure in the RO model and secure under $\mathrm{MD}^h$?"
There might be several cryptosystems that remain secure when RO is replaced
by $\mathrm{MD}^h$. If we can confirm this for many cryptosystems that are widely used,
the original MD construction remains alive in the indifferentiability framework!

Our Contribution. Since $\mathrm{MD}^h \not\sqsubset \mathrm{RO}$ holds, we modify RO such that $\mathrm{MD}^h$ is
indifferentiable from the modified RO. Then we analyze cryptosystems security
within the modified RO model. Concretely, we adopt the following approach.

1. Find a variant $\widetilde{\mathrm{RO}}$ of RO that leaks enough information such that $S$ can
simulate the extension attack.

2. Prove that $\mathrm{MD}^h \sqsubset \widetilde{\mathrm{RO}}$ holds.

3. Prove the cryptosystem's security in the $\widetilde{\mathrm{RO}}$ model.

Secure cryptosystems in the $\widetilde{\mathrm{RO}}$ model are also secure under $\mathrm{MD}^h$ due to the
indifferentiability framework. Therefore, we concentrate on proposing $\widetilde{\mathrm{RO}}$ that
can support many applications.

First we propose the Traceable Random Oracle TRO as $\widetilde{\mathrm{RO}}$.

Traceable Random Oracle. Our proposal of TRO is motivated by the following 
points: 

- Applications of TRO hide the outputs of hash functions from adversaries.
One example is OAEP encryption: adversaries cannot know the outputs of
the hash functions that are used for calculating a ciphertext, since these
values are hidden by a random value or a trapdoor one-way permutation.

- TRO leaks useful information such that $S$ can run the extension attack.

By considering the above points, it is convenient for $S$ to obtain useful information
from value $z$, which is the output of $\mathrm{RO}(M)$. Thus we define TRO that leaks
input $M$ on query $z$ such that $\mathrm{RO}(M) = z$. Since $S$ can obtain value $M$ such that
$z = \mathrm{RO}(M)$, $S$ can know value $z' = \mathrm{RO}(M \| m)$ by using TRO. Therefore, $S$ can
run the extension attack. We will prove that $\mathrm{MD}^h \sqsubset \mathrm{TRO}$ holds (Corollary ??).
Since the hash function outputs for OAEP and variants of OAEP (e.g. OAEP+)
are hidden, adversaries cannot use TRO effectively. So we can easily confirm that
these cryptosystems are secure in the TRO model.

Limitation of TRO. Though TRO can easily confirm the security of many
cryptosystems under $\mathrm{MD}^h$, there are several cryptosystems whose security we cannot
confirm by TRO. For example, RSA-KEM is insecure in the TRO model
(Theorem ??). It is possible that there are cryptosystems that are secure under
$\mathrm{MD}^h$ but insecure in the TRO model, because TRO leaks information beyond that
needed to simulate the extension attack. The essential information to simulate the
extension attack is just $z' = \mathrm{RO}(M \| m)$, but TRO leaks $M$, which is not essential.

Our response is to propose the Extension Attack Simulatable Random Oracle ERO
as $\widetilde{\mathrm{RO}}$.

Extension Attack Simulatable Random Oracle. We define ERO that leaks just $z'$
($= \mathrm{RO}(M \| m)$). By using ERO, $S$ can run the extension attack, since $S$ can know
$z'$. We will prove that $\mathrm{MD}^h \sqsubset \mathrm{ERO}$ holds (Theorem ??). We will also prove that
RSA-KEM is secure in the ERO model (Theorem ??). Therefore, we can confirm
RSA-KEM security under $\mathrm{MD}^h$ by using ERO. Fortunately, $\mathrm{MD}^h$ is equivalent to
ERO, since $\mathrm{ERO} \sqsubset \mathrm{MD}^h$ holds (Theorem ??). Namely, any cryptosystem that is
secure under $\mathrm{MD}^h$ is equally secure in the ERO model and vice versa. Therefore,
ERO is necessary and sufficient to confirm the security of cryptosystems under
$\mathrm{MD}^h$. When we analyze a cryptosystem under $\mathrm{MD}^h$, all that is needed is to prove
the cryptosystem's security in the ERO model.

TRO vs. ERO. Since TRO leaks more information than ERO, we will prove
$\mathrm{ERO} \sqsubset \mathrm{TRO}$. Since ERO has wider applicability, we recommend that ERO be
used for cryptosystems whose security cannot be proven in the TRO model.

ERO vs. RO. Since ERO leaks several bits of information in permitting the
simulation of the extension attack, $\mathrm{RO} \sqsubset \mathrm{ERO}$ and $\mathrm{ERO} \not\sqsubset \mathrm{RO}$ explicitly hold.
As evidence of the separation between RO and ERO, we pick up prefix MAC [?],
which is secure in the RO model, and prove that prefix MAC is insecure in the
ERO model (Theorem ??). Since ERO is equivalent to $\mathrm{MD}^h$, prefix MAC is also
insecure in the $\mathrm{MD}^h$ model.

Leaky Random Oracle. The leaky random oracle LRO was proposed by Yoneyama
et al. [?], but with a different motivation. LRO has a function that leaks all
query-response pairs of RO. In this paper, we will prove that $\mathrm{TRO} \sqsubset \mathrm{LRO}$ and
$\mathrm{LRO} \not\sqsubset \mathrm{TRO}$ hold. Therefore, all cryptosystems secure in the LRO model are also
secure in the TRO model and there is some cryptosystem that is insecure in the
LRO model but secure in the TRO model. Since FDH is secure in the LRO model
[?], FDH is secure under $\mathrm{MD}^h$. Since OAEP is insecure in the LRO model [?]
and secure in the TRO model, OAEP is evidence of the separation between LRO
and TRO.

Remarks. First we compare LRO, TRO and ERO from the viewpoint of security
proofs of cryptosystems. LRO, TRO, and ERO each consist of RO and an additional
oracle (denoted LO, TO and EO, respectively). Since LO leaks more information to
adversaries than TO, adversaries that are given LRO have more flexible strategies
than adversaries given TRO. That is, security proofs in the LRO model are more
complex than those in the TRO model. The same is true for TRO and ERO.

Finally, for the security proof of cryptosystem $C(\mathrm{MD}^h)$, we compare the direct
proof in $\mathrm{MD}^h$ with the proof via ERO. Since $\mathrm{MD}^h$ has the MD structure, we
must consider this structure in the direct proof. On the other hand, since ERO
does not have this structure, we do not need to consider it. For example, we
must consider the events of inner collisions for $\mathrm{MD}^h$ in the direct proof. However,
this is not necessary for the proof in the ERO model. Moreover, since we can
reuse existing proofs for the simulation of RO in the security proof in the ERO
model, we only need to consider the simulation of EO in the security proof. Therefore,
the security proof in the ERO model is easier than the direct proof in $\mathrm{MD}^h$.
Since ERO is equivalent to $\mathrm{MD}^h$, we can confirm a cryptosystem's security under $\mathrm{MD}^h$
by proving its security in ERO, an easier task than a direct proof.

Related Works. Recently, Dodis et al. independently proposed a methodology
to salvage the original and modified MD constructions in many applications [?].
They found two properties: one is preimage awareness (PrA), and the other is
the public-use random oracle (pub-RO). pub-RO is the same as LRO. The approach
of pub-RO is almost the same as our approach of LRO. Dodis et al. pointed out that
the security of cryptosystems that satisfy the following property can be easily
proven in the pub-RO model: all inputs of hash functions are public to the
adversaries. Therefore, PSS, the Fiat-Shamir signature scheme, and others are
easily proven to be secure in the pub-RO model by using existing proofs in the
RO model. Since $\mathrm{LRO(pub\text{-}RO)} \not\sqsubset \mathrm{TRO}$ and $\mathrm{TRO} \sqsubset \mathrm{LRO(pub\text{-}RO)}$ hold, TRO
and ERO have more applications than LRO(pub-RO) (e.g. OAEP is secure in
the TRO model but insecure in the pub-RO model). The approach of PrA is
interesting in that this approach can treat the case where the requirement on the
compression function $f$ is relaxed from FILRO to property PrA. It seems, however,
that this approach is not effective in saving the original MD construction, since
this approach modifies the MD construction by processing the output of the MD
construction by FILRO.

Cryptosystems Security under the Merkle-Damgård Hash Function.

PSS, Fiat-Shamir, and so on are secure under $\mathrm{MD}^h$ thanks to pub-RO [?], OAEP
and variants of OAEP are secure under $\mathrm{MD}^h$ thanks to TRO, and RSA-KEM is
secure under $\mathrm{MD}^h$ thanks to ERO. Since many cryptosystems are secure under
$\mathrm{MD}^h$, the original Merkle-Damgård construction is still alive!

2 Preliminaries 

2.1 Merkle-Damgard Construction 

We first give a short description of the Merkle-Damgård (MD) construction.
Function $\mathrm{MD}^f : \{0,1\}^* \to \{0,1\}^n$ is built by iterating compression function
$f : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^n$ as follows.

- $\mathrm{MD}^f(M)$:

1. calculate $M' = \mathrm{pad}(M)$ where pad is a padding function such that
$\mathrm{pad} : \{0,1\}^* \to (\{0,1\}^t)^*$.

2. calculate $c_i = f(c_{i-1}, m_i)$ for $i = 1, \dots, l$ where for $i = 1, \dots, l$, $|m_i| = t$,
$M' = m_1 \| \dots \| m_l$ and $c_0$ is an initial value (s.t. $|c_0| = n$).

3. return $c_l$

In this paper we ignore the above padding function; this does not degrade
generality. Hereafter we discuss $\mathrm{MD}^f : (\{0,1\}^t)^* \to \{0,1\}^n$. We use a random oracle
compression function $h$ as $f$, where $h : \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^n$. Thus we
discuss below the hash function $\mathrm{MD}^h$ with the MD construction using $h$.
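The iteration above, with padding ignored, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the byte lengths `N_BYTES` and `T_BYTES`, the all-zero `IV`, and the use of truncated SHA-256 as a stand-in for the random oracle compression function $h$ are hypothetical choices made only so the example runs; they are not part of the construction described in the text.

```python
import hashlib

N_BYTES = 16       # chaining/output length n, in bytes (illustrative assumption)
T_BYTES = 16       # message block length t, in bytes (illustrative assumption)
IV = bytes(N_BYTES)  # fixed initial value c_0 (illustrative assumption)


def h(chaining: bytes, block: bytes) -> bytes:
    """Stand-in for the compression function h: {0,1}^n x {0,1}^t -> {0,1}^n."""
    assert len(chaining) == N_BYTES and len(block) == T_BYTES
    return hashlib.sha256(b"h" + chaining + block).digest()[:N_BYTES]


def md(blocks, f=h, iv=IV) -> bytes:
    """Plain MD iteration over a list of t-bit blocks (no padding)."""
    c = iv
    for m in blocks:
        c = f(c, m)  # c_i = f(c_{i-1}, m_i)
    return c         # c_l
```

For a two-block message, `md([m1, m2])` equals `h(h(IV, m1), m2)`, which is exactly the structure the extension attack exploits.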

2.2 Random Oracle 

$\mathrm{RO} : \{0,1\}^* \to \{0,1\}^n$ can be realized as follows. RO initially has the empty
hash list $\mathcal{L}_{RO}$. On query $M$, if $\exists (M, z) \in \mathcal{L}_{RO}$, it returns $z$. Otherwise, it chooses
$z \in \{0,1\}^n$ at random, adds $(M, z)$ to $\mathcal{L}_{RO}$, hereafter denoted by $\mathcal{L}_{RO} \leftarrow (M, z)$,
and returns $z$.
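This lazy-sampling realization translates directly into code. The sketch below is illustrative only; the class name `RandomOracle`, the attribute `table` (playing the role of the hash list $\mathcal{L}_{RO}$), and the 16-byte output length are assumptions made for the example.

```python
import secrets


class RandomOracle:
    """Lazily sampled random oracle with an explicit hash list."""

    def __init__(self, n_bytes: int = 16):
        self.n_bytes = n_bytes
        self.table = {}  # the hash list L_RO: message -> output

    def query(self, message: bytes) -> bytes:
        # If (M, z) is already in the list, return the stored z.
        if message not in self.table:
            # Otherwise choose z at random and record (M, z).
            self.table[message] = secrets.token_bytes(self.n_bytes)
        return self.table[message]
```

Repeated queries on the same message return the same value, while each fresh message receives an independently sampled output.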

2.3 Leaky Random Oracle 

LRO was proposed by Yoneyama et al. [?]. LRO can be realized as follows. LRO
consists of RO and LO. On a leak query to LO, LO outputs the entire contents
of $\mathcal{L}_{RO}$. We can define $S$ that can simulate the extension attack by using LRO,
since $S$ can know $M$ from $z$ by using LO and can know $z'$ by posing $M \| m$ to
RO.


2.4 Indifferentiability 

The indifferentiability framework generalizes the fundamental concept of the
indistinguishability of two cryptosystems $C(U)$ and $C(V)$, where $C(U)$ is the
cryptosystem $C$ that invokes the underlying primitive $U$ and $C(V)$ is the cryptosystem
$C$ that invokes the underlying primitive $V$. $U$ and $V$ have two interfaces: public
and private interfaces. Adversaries can only access the public interfaces and
honest parties (e.g. the cryptosystem $C$) can access only the private interface.

We denote the private interface of the system $W$ by $W^{priv}$ and the public
interface of the system $W$ by $W^{pub}$. The definition of indifferentiability is as
follows.

Definition 1. $V$ is indifferentiable from $U$, denoted $V \sqsubset U$, if for any
distinguisher $D$ with binary output (0 or 1) there is a polynomial time simulator $S$
such that $|\Pr[D^{V^{priv}, V^{pub}} \Rightarrow 1] - \Pr[D^{U^{priv}, S(U^{pub})} \Rightarrow 1]| < \epsilon$. Simulator $S$ has oracle
access to $U^{pub}$ and runs in time at most $t_S$. Distinguisher $D$ runs in time at most
$t_D$ and makes at most $q$ queries. $\epsilon$ is negligible in security parameter $k$.

This definition will allow us to use construction $V$ instead of $U$ in any cryptosystem
$C$ and retain the same level of provable security due to the indifferentiability
theory of Maurer et al. [?]. We denote "$C(V)$ is at least as secure as $C(U)$" by
$C(V) \succ C(U)$. Namely, $C(V) \succ C(U)$ denotes the case that if $C(U)$ is secure, then
$C(V)$ is secure. More strictly, $V \sqsubset U \Leftrightarrow C(V) \succ C(U)$ holds.
2.5 Extension Attack

Coron et al. showed that $\mathrm{MD}^h$ is not indifferentiable from RO due to the extension
attack. The extension attack targets $\mathrm{MD}^h$, where we can calculate a new hash
value from some hash value. Namely, $z' = \mathrm{MD}^h(M \| m)$ can be calculated from
only $z$ and $m$ by $z' = h(z, m)$ where $z = \mathrm{MD}^h(M)$. Note that $z'$ can be calculated
without using $M$. The distinguishing attack based on the extension attack is as follows.
Let $O_a$ be $\mathrm{MD}^h$ or RO and let $O_b$ be $h$ or $S$. First, a distinguisher poses $M$ to $O_a$
and gets $z$ from $O_a$. Second, he poses $(z, m)$ to $O_b$ and gets $c$ from $O_b$. Finally,
he poses $M \| m$ to $O_a$ and gets $z'$ from $O_a$.

If $O_a = \mathrm{MD}^h$ and $O_b = h$, then $z' = c$, while if $O_a = \mathrm{RO}$ and $O_b = S$,
then $z' \neq c$ except with negligible probability. This is because no simulator can
obtain the output value of $\mathrm{RO}(M \| m)$ from just $(z, m)$, and the output value of
$\mathrm{RO}(M \| m)$ is independently and randomly defined from $c$. Therefore,
$\mathrm{MD}^h \not\sqsubset \mathrm{RO}$ holds.
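The three-query distinguisher can be made concrete as follows. This is a hedged sketch, not the formal experiment: truncated SHA-256 stands in for $h$, messages are lists of fixed-size blocks, and `naive_simulator` is a hypothetical simulator that answers with fresh random values, since it cannot learn $\mathrm{RO}(M \| m)$ from $(z, m)$ alone. All names and parameters are assumptions made for illustration.

```python
import hashlib
import secrets

N = 16          # output length in bytes (illustrative assumption)
IV = bytes(N)   # fixed initial value (illustrative assumption)


def h(c: bytes, m: bytes) -> bytes:
    """Stand-in for the fixed input length random oracle h."""
    return hashlib.sha256(c + m).digest()[:N]


def md_h(blocks) -> bytes:
    """Plain Merkle-Damgard iteration of h over a list of blocks."""
    c = IV
    for m in blocks:
        c = h(c, m)
    return c


class LazyRO:
    """Monolithic random oracle on block sequences, sampled lazily."""

    def __init__(self):
        self.table = {}

    def __call__(self, blocks) -> bytes:
        key = tuple(blocks)
        if key not in self.table:
            self.table[key] = secrets.token_bytes(N)
        return self.table[key]


def naive_simulator(z: bytes, m: bytes) -> bytes:
    """Hypothetical simulator without any leak oracle: it can only guess."""
    return secrets.token_bytes(N)


def distinguisher(O_a, O_b, M, m) -> bool:
    z = O_a(M)               # first query: M
    c = O_b(z, m)            # second query: (z, m)
    z_prime = O_a(M + [m])   # third query: M || m
    return z_prime == c      # True indicates the MD^h / h world
```

With `O_a = md_h` and `O_b = h` the check always succeeds; with `O_a = LazyRO()` and `O_b = naive_simulator` it fails except with probability $2^{-128}$ for these toy parameters.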

3 Variants of Random Oracles 

In this section, we will introduce several variants of random oracles in order for S 
to simulate the extension attack described above, and then show the relationships 
among these oracles within the indifferentiability framework. 

3.1 Definition of Variants of Random Oracles 

Traceable Random Oracle: TRO consists of RO and TO. On trace query $z$,

1. If there exist pairs such that $(M_i, z) \in \mathcal{L}_{RO}$ ($i = 1, \dots, n$), it returns
$(M_1, \dots, M_n)$.

2. Otherwise, it returns $\perp$.

We can define $S$ that can simulate the extension attack by using TRO, since $S$
can know $M$ from $z$ by using TO and can know $z'$ by posing $M \| m$ to RO.
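A small sketch of TRO as a data structure may help. The class and method names (`TraceableRandomOracle`, `ro`, `to`), the 16-byte output length, and the use of `None` for $\perp$ are assumptions made for this illustration.

```python
import secrets


class TraceableRandomOracle:
    """RO together with the trace oracle TO, sharing one hash list."""

    def __init__(self, n_bytes: int = 16):
        self.n_bytes = n_bytes
        self.table = {}  # hash list L_RO: message -> output

    def ro(self, message: bytes) -> bytes:
        if message not in self.table:
            self.table[message] = secrets.token_bytes(self.n_bytes)
        return self.table[message]

    def to(self, z: bytes):
        # Return every message M with (M, z) in L_RO, or None (for "bottom").
        preimages = tuple(M for M, value in self.table.items() if value == z)
        return preimages if preimages else None
```

A simulator that receives $(z, m)$ can call `to(z)`, and if it obtains a message `M`, answer with `ro(M + m)`, which is how TRO lets $S$ reproduce the extension attack.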

Extension Attack Simulatable Random Oracle: TRO leaks more information than is
needed to simulate the extension attack. So we define ERO such that $S$ is given
just the important information. The important information is value $z'$ such that
$z' = \mathrm{RO}(M \| m)$. Therefore, we define ERO as follows. ERO consists of RO and
EO. EO initially has the empty list $\mathcal{L}_{EO}$ and can look into $\mathcal{L}_{RO}$. On simulation
query $(m, z)$ to EO where $|m| = t$,

1. If $(m, z, z') \in \mathcal{L}_{EO}$, it returns $z'$.

2. Else if $z = IV$, EO poses query $m$ to RO, receives $z'$, $\mathcal{L}_{EO} \leftarrow (m, z, z')$, and
returns $z'$.

3. Else if there exists only one pair $(M, z) \in \mathcal{L}_{RO}$, EO poses query $M \| m$ to RO,
receives $z'$, $\mathcal{L}_{EO} \leftarrow (m, z, z')$, and returns $z'$.

4. Else EO chooses $z' \in \{0,1\}^n$ at random, $\mathcal{L}_{EO} \leftarrow (m, z, z')$, and returns $z'$.

We can construct $S$ that can simulate the extension attack by using ERO, since
$S$ can obtain $z'$ from $(m, z)$ where $z' = \mathrm{RO}(M \| m)$ by using EO.
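The four cases of EO can be sketched as follows. This is an illustration under assumptions: messages are byte strings, concatenation stands for $M \| m$, the all-zero `iv` and the 16-byte block and output lengths are arbitrary, and the class and method names are hypothetical.

```python
import secrets


class ExtensionAttackSimulatableRO:
    """RO together with the extension oracle EO."""

    def __init__(self, n_bytes: int = 16, t_bytes: int = 16):
        self.n_bytes = n_bytes
        self.t_bytes = t_bytes
        self.iv = bytes(n_bytes)  # fixed IV (illustrative assumption)
        self.ro_table = {}        # L_RO: message -> output
        self.eo_table = {}        # L_EO: (m, z) -> z'

    def ro(self, message: bytes) -> bytes:
        if message not in self.ro_table:
            self.ro_table[message] = secrets.token_bytes(self.n_bytes)
        return self.ro_table[message]

    def eo(self, m: bytes, z: bytes) -> bytes:
        assert len(m) == self.t_bytes
        if (m, z) in self.eo_table:                    # case 1: already answered
            return self.eo_table[(m, z)]
        preimages = [M for M, v in self.ro_table.items() if v == z]
        if z == self.iv:                               # case 2: z is the IV
            z_prime = self.ro(m)
        elif len(preimages) == 1:                      # case 3: unique (M, z)
            z_prime = self.ro(preimages[0] + m)
        else:                                          # case 4: fresh random value
            z_prime = secrets.token_bytes(self.n_bytes)
        self.eo_table[(m, z)] = z_prime
        return z_prime
```

Note that EO returns only $z'$ and never reveals $M$ itself, which is the difference from TO that the text emphasizes.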
3.2 Relationships among LRO, TRO, ERO, and RO Models within
the Indifferentiability Framework

LRO leaks more information of $\mathcal{L}_{RO}$ than TRO, and TRO leaks more information
of $\mathcal{L}_{RO}$ than ERO. Therefore, it seems reasonable to suppose that anything secure
in the LRO model is also secure in the TRO model, anything secure in the TRO
model is also secure in the ERO model, and any cryptosystem secure in the ERO
model is also secure in the RO model. We prove the validity of these suppositions
by using the indifferentiability framework.

First we clarify the relationship between TRO and LRO. 

Theorem 1. $\mathrm{TRO} \sqsubset \mathrm{LRO}$ and $\mathrm{LRO} \not\sqsubset \mathrm{TRO}$.

Proof. We construct $S$ which simulates TO by using LRO as follows. Given query
$z$, $S$ poses a leak query to LO and receives the entire information of $\mathcal{L}_{RO}$. If
there exist pairs such that $(M_i, z) \in \mathcal{L}_{RO}$ ($i = 1, \dots, n$), it returns $(M_1, \dots, M_n)$.
Otherwise it returns $\perp$.

It is easy to see that $|\Pr[D^{RO,TO} \Rightarrow 1] - \Pr[D^{RO,S(LRO)} \Rightarrow 1]| = 0$, since the
output from each step of $S$ is equal to that from each step of TO.

$\mathrm{LRO} \not\sqsubset \mathrm{TRO}$ is trivial, since no $S$ can acquire all values in $\mathcal{L}_{RO}$ by using
TRO. □

Since $\mathrm{TRO} \sqsubset \mathrm{LRO}$, any cryptosystem secure in the LRO model is also secure in
the TRO model by the indifferentiability framework. Since $\mathrm{LRO} \not\sqsubset \mathrm{TRO}$, there
exists some cryptosystem that is secure in the TRO model but insecure in the
LRO model. For example, Yoneyama et al. proved that OAEP is insecure in the
LRO model [?]. Since OAEP is secure in the TRO model, OAEP is evidence of
the separation between LRO and TRO.

Next we will clarify the relationship between ERO and TRO. 

Theorem 2. $ERO \sqsubset TRO$ and $TRO \not\sqsubset ERO$.

Proof. We construct S which simulates EO by using TRO as follows. S initially
has the empty list $\mathcal{L}_S$. On query $(m, z)$, if $\exists (m, z, z') \in \mathcal{L}_S$, it returns $z'$. Otherwise
S poses query $z$ to TO, and receives string $X$. If $X$ consists of one value, it
poses query $X \| m$ to RO, receives $z'$, $\mathcal{L}_S \leftarrow (m, z, z')$ and returns $z'$. Otherwise,
it chooses $z' \in \{0, 1\}^n$ at random, $\mathcal{L}_S \leftarrow (m, z, z')$ and returns $z'$.

It is easy to see that $|\Pr[D^{RO,EO} \Rightarrow 1] - \Pr[D^{RO,S(TRO)} \Rightarrow 1]| = 0$, since the
output from each step of S is equal to that from each step of EO.

$TRO \not\sqsubset ERO$ is trivial, since no S can decide whether there exists $(M, z)$
in $\mathcal{L}_{RO}$ or not by using ERO. □

Since $ERO \sqsubset TRO$, any cryptosystem secure in the TRO model is also secure in
the ERO model in the indifferentiability framework. Since $TRO \not\sqsubset ERO$, there
exists some cryptosystem that is secure in the ERO model but insecure in the
TRO model. We will prove that RSA-KEM is secure in the ERO model but
insecure in the TRO model in Section 5. Therefore, RSA-KEM is evidence of the
separation between TRO and ERO.

Finally we will clarify the relationship between RO and ERO. 
Theorem 3. $RO \sqsubset ERO$ and $ERO \not\sqsubset RO$.

The proof of Theorem 3 is trivial because ERO consists of RO and the additional
oracle EO which leaks some information of $\mathcal{L}_{RO}$. Since $RO \sqsubset ERO$, any
cryptosystem secure in the ERO model is also secure in the RO model by the
indifferentiability framework. Since $ERO \not\sqsubset RO$, there exists some cryptosystem
which is secure in the RO model but insecure in the ERO model. We can show 
simple evidence of the separation between ERO and RO as follows: We consider 
the following Prefix-MAC protocol which is unforgeable in the RO model. Note 
that the concept of unforgeability with regard to MAC schemes is defined in p . 

Prefix MAC [12]: Alice and Bob share one secret key, $K$, as an authentication
key. Before sending message $M$ to Bob, Alice sends $K \| M$ to RO $H$ to obtain
a MAC value denoted as $y$. Finally, Alice sends $(M, y)$ to Bob. When Bob
obtains $(M, y)$, he sends $K \| M$ to $H$ to obtain another MAC value $y'$. If $y'$ is
equal to $y$, then Bob is convinced that message $M$ is from Alice. Otherwise,
Bob will reject message $M$.

We will show that Prefix MAC fails to satisfy unforgeability for MAC schemes 
in the ERO model. 

Theorem 4 (Insecurity of Prefix MAC in the ERO model). Prefix MAC 
does not satisfy unforgeability for MAC schemes where H is modeled as ERO. 

Proof. A forgery procedure is as follows: forger $\mathcal{F}$ obtains a valid pair $(M, h)$
from MAC, where $h = H(K \| M)$. $\mathcal{F}$ sends query $(m, h)$ to EO, and obtains $h' =
H(K \| M \| m)$. Since $M \| m$ is not queried to MAC, $\mathcal{F}$ succeeds in an existential
forgery under a known message attack (EF-KMA) using ERO $H$. □

Therefore, Prefix-MAC is secure in the RO model but insecure in the ERO model. 
Consequently, Prefix-MAC is evidence of the separation between ERO and RO. 
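The forgery can be replayed in a few lines of Python. The sketch below is illustrative only: the lazily sampled dictionary `l_ro`, the 16-byte output length, the helper names `mac` and `verify`, and the example messages are assumptions, and `EO` is reduced to the single step of the oracle that the forger needs (a unique preimage of the queried value in $\mathcal{L}_{RO}$).

```python
import secrets

N_BYTES = 16
l_ro = {}   # list L_RO of the random oracle H: input -> output


def H(M: bytes) -> bytes:
    """Random oracle, sampled lazily."""
    if M not in l_ro:
        l_ro[M] = secrets.token_bytes(N_BYTES)
    return l_ro[M]


def EO(m: bytes, z: bytes) -> bytes:
    """Extension attack oracle, simplified to the unique-preimage step."""
    preimages = [M for M, v in l_ro.items() if v == z]
    if len(preimages) == 1:
        return H(preimages[0] + m)
    return secrets.token_bytes(N_BYTES)


def mac(K: bytes, M: bytes) -> bytes:
    return H(K + M)                 # Prefix MAC: y = H(K || M)


def verify(K: bytes, M: bytes, y: bytes) -> bool:
    return mac(K, M) == y


K = secrets.token_bytes(16)         # secret key shared by Alice and Bob
M = b"pay 10 to Bob"
h = mac(K, M)                       # the forger observes the valid pair (M, h)
m = b" and 9999 to Eve"
forged = EO(m, h)                   # one EO query, no knowledge of K required
```

Here `verify(K, M + m, forged)` accepts although $M \| m$ was never submitted to the MAC oracle.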

From the above discussions, the following corollary is obtained.

Corollary 1. $RO \sqsubset ERO \sqsubset TRO \sqsubset LRO$, and $LRO \not\sqsubset TRO \not\sqsubset ERO \not\sqsubset RO$.

4 Relationship between $MD^h$ and ERO in the
Indifferentiability Framework

In this section we prove that $MD^h \sqsubset ERO$ and $ERO \sqsubset MD^h$ hold as follows. In
Theorem 5 we use statements $\sigma_H$ and $q_h$ instead of the total number of queries
$q$. $\sigma_H$ is the total number of message blocks for RO/$MD^h$ and $q_h$ is the total
number of queries to $S/h$.

Theorem 5. $MD^h \sqsubset ERO$, for any $t_D$, with $t_S = O(q_h^2)$ and

$\epsilon \le \frac{4(\sigma_H + q_h)^2 + 2(\sigma_H + q_h)}{2^n}$.

This proof is given in subsection 4.1.

In Theorem 6 we use statements $\sigma_H$ and $q_{EO}$ instead of the total number of
queries $q$. $\sigma_H$ is the total number of message blocks for RO/$MD^h$ and $q_{EO}$ is the
total number of queries to EO/$S$.

Theorem 6. $ERO \sqsubset MD^h$, for any $t_D$, with $t_S = O(q_{EO})$ and

$\epsilon \le \frac{2(\sigma_H + q_{EO})^2 + (\sigma_H + q_{EO})}{2^n}$.

This proof is given in subsection 4.2.

From Theorem 5 and Theorem 6, ERO is equivalent to $MD^h$ in the indifferentiability
framework. From Corollary 1, Theorem 5 and Theorem 6, the following
corollary is obtained.

Corollary 2. $RO \sqsubset MD^h = ERO \sqsubset TRO \sqsubset LRO$, and $LRO \not\sqsubset TRO \not\sqsubset ERO =
MD^h \not\sqsubset RO$.

4.1 Proof of Theorem 5

First we define simulator S as follows. S has a list $T$ which is initially empty. We
define chain triples as follows.

Definition 2 (Chain Triples). Triples $(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i)$ are chain
triples if $x_1 = IV$ and $y_j = x_{j+1}$ $(j = 1, \ldots, i - 1)$ holds.

Simulator S: On a query $(x, m)$,

1. If $\exists (x, m, y) \in T$, it outputs $y$.

2. Else if $\exists$ chain triples $(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i) \in T$ such that $x = y_i$, $y \leftarrow
RO(m_1 \| \ldots \| m_i \| m)$.

3. Else, $y \leftarrow EO(m, x)$.

4. $T \leftarrow (x, m, y)$.

5. S returns $y$.

Since S needs to search pairs in $T$, this requires at most $O(q_h^2)$ time.
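The simulator and its chain-triple search can be sketched as follows. This is a sketch under stated assumptions rather than the authors' implementation: `find_chain`, `make_simulator`, and the all-zero `IV` are hypothetical names and values, and the `ro`/`eo` functions at the bottom are deterministic SHA-256 stand-ins used only to exercise the control flow (they are not random oracles).

```python
import hashlib

IV = bytes(16)  # assumed initial value


def find_chain(T, x):
    """Return [m_1, ..., m_i] for chain triples in T with y_i == x, else None."""
    frontier = [(IV, [])]          # (chaining value reached, blocks used so far)
    seen = set()
    while frontier:
        cur, blocks = frontier.pop()
        for (tx, tm), ty in T.items():
            if tx == cur and (tx, tm) not in seen:
                seen.add((tx, tm))
                if ty == x:
                    return blocks + [tm]
                frontier.append((ty, blocks + [tm]))
    return None


def make_simulator(ro, eo):
    T = {}                                    # list T: (x, m) -> y

    def S(x, m):
        if (x, m) in T:                       # step 1
            return T[(x, m)]
        chain = find_chain(T, x)
        if chain is not None:                 # step 2
            y = ro(b"".join(chain) + m)
        else:                                 # step 3
            y = eo(m, x)
        T[(x, m)] = y                         # step 4
        return y                              # step 5

    return S


def ro(M):
    """Deterministic stand-in for RO (illustration only)."""
    return hashlib.sha256(b"ro" + M).digest()[:16]


def eo(m, x):
    """Deterministic stand-in for EO: RO(m) when x is IV, unrelated value otherwise."""
    return ro(m) if x == IV else hashlib.sha256(b"eo" + x + m).digest()[:16]


S = make_simulator(ro, eo)
y1 = S(IV, b"a")
y2 = S(y1, b"b")
```

Iterating S along a chain therefore reproduces `ro` on the concatenated blocks, which is the consistency property that Lemma 1 establishes.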

We need to prove that D cannot tell apart two scenarios, ERO and $MD^h$. In one
scenario D has oracle access to RO and S while in the other D has access to $MD^h$
and $h$. The proof involves a hybrid argument starting in the ERO scenario, and
ending in the $MD^h$ scenario through a sequence of mutually indistinguishable
hybrid games.

We give six events that allow D to distinguish $MD^h$ from ERO. These events
arise from the fact that $MD^h$ has the MD construction but ERO does not. We
explain these events as follows. Details of these events are given in Game 3.

First we discuss distinguishing events that occur due to differences between RO
and $MD^h$. RO and $MD^h$ return a random value unless a collision occurs. Therefore,
distinguishing events occur when a collision occurs. When a collision of $MD^h$
occurs, one of the following events occurs due to the MD construction: an output of
$h$ is equal to $IV$ (event E1) or a collision of $h$ occurs (event E2). On the other
hand, since RO is a monolithic function, these events don't occur. Therefore,
these events are distinguishing events between $MD^h$ and ERO.

Second, we discuss distinguishing events that occur due to differences between
S and $h$. Since for $h$ there is the relation that $h(x, m) = RO(M \| m)$ where
$MD^h(M) = x$, S must simulate the relation such that $S(x, m) = RO(M \| m)$
where $RO(M) = x$. On query $(x, m)$ to S, if only one pair exists $(M, x) \in \mathcal{L}_{RO}$
such that $x \ne IV$ holds, S can know $MD^h(M \| m)$ by using EO. Therefore, S
can simulate the relation. If such a pair does not exist ($(M, x) \notin \mathcal{L}_{RO}$), since S
cannot know $M$, S cannot know the value of $RO(M \| m)$. Therefore, S cannot
simulate the relation (event E3 and event E5). If two or more such pairs exist
($(M, x), (M', x), \ldots \in \mathcal{L}_{RO}$), S must simulate the relation such that $RO(M \| m) =
RO(M' \| m) = \cdots$. However, since S cannot control the outputs of RO, it cannot
simulate the relation (event E4).

On the other hand, if $\exists (M, x) \in \mathcal{L}_{RO}$ such that $x = IV$, S must simulate the
relation such that $RO(m) = RO(M \| m)$. However, since S cannot control the
outputs of RO, it cannot simulate the relation (event E6).

In the following game transforms, since the MD construction is considered in
Game 3 for the first time, we discuss these events in the transform from Game 2 
to Game 3. In this discussion, we show that if distinguishing events don’t occur, 
Game 3 is identical to Game 2, and the probability that one of the events will 
occur is negligible. 

Game 1: This is the random oracle model, where D has oracle access to RO
and S. Let G1 denote the event that D outputs 1 after interacting with RO and
S. Thus $\Pr[G1] = \Pr[D^{RO,S(ERO)} \Rightarrow 1]$.

Game 2: In this game, we give the distinguisher oracle access to a dummy relay
algorithm $R_0$ instead of direct oracle access to RO. $R_0$ is given oracle access to
RO. On query $M$ to $R_0$, it queries $M$ to RO and returns $RO(M)$. Let G2 denote
the event that D outputs 1 in Game 2. Since the view of D remains unchanged
in this game, $\Pr[G2] = \Pr[G1]$.

Game 3: In this game, we modify the relay algorithm $R_0$ into $R_1$ as follows.
For hash oracle query $M$, $R_1$ applies the MD construction to $M$ by querying S.
$R_1$ is essentially the same as $MD^h$ except that $R_1$ is based on S instead of the
fixed input length random oracle $h$.
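As an illustration of what the relay algorithm does, the following sketch iterates a two-input function over the blocks of a message. It is a sketch only: the function name `md`, the 8-byte block length, the all-zero `IV`, the absence of padding, and the truncated SHA-256 stand-in `h` are assumptions of the example, not details fixed by this section.

```python
import hashlib

N_BYTES = 16          # chaining value length (assumed)
BLOCK = 8             # block length t in bytes (assumed)
IV = bytes(N_BYTES)   # assumed initial value


def h(x: bytes, m: bytes) -> bytes:
    """Stand-in for the compression function h (or the simulator S)."""
    return hashlib.sha256(x + m).digest()[:N_BYTES]


def md(f, M: bytes) -> bytes:
    """Iterate f over the blocks of M, starting from IV (no padding in this sketch)."""
    assert len(M) % BLOCK == 0
    x = IV
    for i in range(0, len(M), BLOCK):
        x = f(x, M[i:i + BLOCK])
    return x
```

For any block-aligned `M` and single block `m`, `md(h, M + m)` equals `h(md(h, M), m)`, which is the relation between the iterated hash and its compression function that the simulator has to reproduce.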

We show that Game 3 is identical with Game 2 unless the following bad events
occur. In response to query $(x, m)$, S chooses response $y \in \{0, 1\}^n$:

- E1: It is the case that $y = IV$.

- E2: There is a triple $(x', m', y') \in T$, with $(x', m') \ne (x, m)$, such that $y' = y$.

- E3: There is a triple $(x', m', y') \in T$, with $(x', m') \ne (x, m)$, such that $x' = y$
and $(x', m', y')$ is defined except for step 3 of EO.

and in response to a query $M$ to RO, RO returns $z$:

- E4: There is a pair $(M', z') \in \mathcal{L}_{RO}$, with $M \ne M'$, such that $z = z'$.

- E5: There is a triple $(x', m', y') \in T$ such that $z = x'$.

- E6: $z = IV$.

We demonstrate that Game 3 is identical with Game 2 unless bad events occur
and the probability that bad events occur is negligible. Before we demonstrate
these facts, we give a useful property as follows.
Lemma 1. For any chain triples $(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i)$ in $T$, $y_i =
RO(m_1 \| \ldots \| m_i)$ holds unless bad events occur.

Proof. To the contrary, assume that $y_i \ne RO(m_1 \| \ldots \| m_i)$. Since $y_i$ is defined in step
2 of S (case A), step 2 of EO (case B), step 3 of EO (case C), or step 4 of EO
(case D), we show that when $y_i$ is defined in each step, bad events occur.

First, we discuss case A. In this case, we distinguish two cases: when $(x_i, m_i, y_i)$
is stored, other chain triples $(x'_1, m'_1, y'_1), \ldots, (x'_t, m'_t, y'_t)$ are already stored in
$T$ such that $y'_t = y_{i-1}$ (case A-1), and such chain triples are not stored in $T$ (case
A-2). Case A-1 is equal to a collision of $MD^S$. Therefore a collision of S occurs
or an output of S is equal to $IV$ in this case. Therefore event E1 or E2 occurs.
In case A-2, since $y_i = RO(m_1 \| \ldots \| m_i)$ holds from the definition of S, this is
contrary to the assumption.

We discuss case B. In this case, we distinguish two cases: $i = 1$ (case B-1) and
$i \ne 1$ (case B-2). In case B-1, $y_1 = RO(m_1)$ holds due to the definition of S.
This is contrary to the assumption. In case B-2, since $x_i = IV$, $y_{i-1} = IV$
holds. Therefore event E1 or E6 occurs.

We discuss case C. In this case, $(M, x_i)$ is already in $\mathcal{L}_{RO}$ when $y_i$
is defined. We consider two cases: $M = m_1 \| \ldots \| m_{i-1}$ (case C-1) and $M \ne
m_1 \| \ldots \| m_{i-1}$ (case C-2). In case C-1, $y_i = RO(m_1 \| \ldots \| m_i)$ holds and this
is contrary to the assumption. In case C-2, we consider two cases: $y_{i-1}$ is
chosen at random by EO (case C-2-1) and $y_{i-1}$ is defined by RO (case C-2-2).
For case C-2-1, from the definition of S, when $(x_{i-1}, m_{i-1}, y_{i-1})$ is stored in
$T$, some triple $(x_j, m_j, y_j)$ is not in $T$. Assume that $j$ is the maximum such number.
Therefore $y_{j+1}, \ldots, y_{i-1}$ are defined at random by EO and independent from RO.
$(x_{j+1}, m_{j+1}, y_{j+1})$ is stored in $T$ before $(x_j, m_j, y_j)$ is stored in $T$. If $y_j$ is defined
at random by EO and independent from RO, event E3 occurs. If $y_j$ is defined by
RO ($y_j = RO(m_1 \| \ldots \| m_j)$), event E5 occurs. Case C-2-2 is equal to event E4.

Finally we discuss case D. From the same discussion as for case C-2-1,
bad event E3 or E5 occurs. □

For the view of D for $R_0$ and $R_1$, from Lemma 1, for any $M$, $R_1(M) = RO(M)$
holds unless bad events occur. Therefore the view of D for $R_0$ is equal to that
for $R_1$. For consistency in Game 2, from the definition of S and Lemma 1,
for any chain triples $(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i) \in T$, $y_i = RO(m_1 \| \ldots \| m_i) =
R_0(m_1 \| \ldots \| m_i)$ holds unless bad events occur. Therefore, the answers given by
S are consistent with those given by $R_0$. For consistency in Game 3, from
the definition of S, the definition of $R_1$ and Lemma 1, for any chain triples
$(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i) \in T$, $y_i = R_1(m_1 \| \ldots \| m_i) = RO(m_1 \| \ldots \| m_i)$ holds
unless bad events occur. Therefore, the answers given by S are consistent with
those given by $R_1$. Therefore, Game 3 is identical with Game 2 unless bad events
occur.

Next we examine the probability that bad events occur as follows. 

Lemma 2. $\Pr[E1 \lor E2 \lor E3 \lor E4 \lor E5 \lor E6] \le \frac{2q_1^2 + q_2^2 + q_1 q_2 + q_1 + q_2}{2^n}$, where $q_1$ is the
maximum number of invocations of the simulator and $q_2$ is the maximum number of
invocations of RO.
Proof. We will examine each of the events and bound their probability.
Since outputs of S are chosen at random, $\Pr[E1] \le \frac{q_1}{2^n}$. Since E2 is the event
where a collision occurs, $\Pr[E2] \le 1 - \frac{2^n - 1}{2^n} \cdots \frac{2^n - q_1 + 1}{2^n} \le \frac{q_1^2}{2^n}$. Since $y$ is chosen
at random, the probability of event E3 is at most $\frac{q_1^2}{2^n}$. Since E4 is the event that a
RO collision occurs, $\Pr[E4] \le \frac{q_2^2}{2^n}$. Since E5 is the event that a random value is
equal to some fixed value, $\Pr[E5] \le \frac{q_1 q_2}{2^n}$. Since E6 is the event that a random
value is equal to $IV$, $\Pr[E6] \le \frac{q_2}{2^n}$. Therefore $\Pr[E1 \lor E2 \lor E3 \lor E4 \lor E5 \lor E6] \le
\Pr[E1] + \Pr[E2] + \Pr[E3] + \Pr[E4] + \Pr[E5] + \Pr[E6] \le \frac{2q_1^2 + q_2^2 + q_1 q_2 + q_1 + q_2}{2^n}$. □
Let G3 denote the event that the distinguisher D outputs 1 in Game 3, B2 be the
event wherein $E1 \lor E2 \lor E3 \lor E4 \lor E5 \lor E6$ occurs in Game 2 and B3 be the event
wherein $E1 \lor E2 \lor E3 \lor E4 \lor E5 \lor E6$ occurs in Game 3. From Lemma 2, the probability
that bad events occur in Game 2 is less than $\frac{\sigma_H^2 + 3q_h^2 + 3q_h\sigma_H + 2q_h + \sigma_H}{2^n}$ and
the probability that bad events occur in Game 3 is less than $\frac{4(\sigma_H + q_h)^2 + 2(\sigma_H + q_h)}{2^n}$.

Therefore $|\Pr[G3] - \Pr[G2]| = |\Pr[G3 \land B3] + \Pr[G3 \land \lnot B3] - \Pr[G2 \land B2] - \Pr[G2 \land
\lnot B2]| \le |\Pr[G3|B3] \times \Pr[B3] - \Pr[G2|B2] \times \Pr[B2]| \le \max\{\Pr[B2], \Pr[B3]\} =
\frac{4(\sigma_H + q_h)^2 + 2(\sigma_H + q_h)}{2^n}$.

Game 4: In this game, we modify simulator S to $S_1$. RO is removed from
simulator $S_1$ as follows.

Simulator $S_1$: On query $(x, m)$,

1. If $\exists (x, m, y) \in T$, it responds with $y$.

2. Else $S_1$ chooses $y \leftarrow \{0, 1\}^n$ at random.

3. $T \leftarrow (x, m, y)$.

4. $S_1$ responds with $y$.

The output of S is chosen at random or chosen by RO. Therefore, for any fresh
query to S, the response is chosen at random. Since RO is invoked only by S, no
D can access RO. Namely, no D can distinguish $S_1$ from S, though RO is removed in
$S_1$, so Game 4 is identical to Game 3. Let G4 denote the event that distinguisher
D outputs 1 in Game 4. $\Pr[G4] = \Pr[G3]$ holds.
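Simulator $S_1$ is an instance of lazy sampling, which can be sketched in Python as below. The class name `LazyRandomFunction` and the 16-byte output length are assumptions chosen for the example.

```python
import secrets


class LazyRandomFunction:
    """Answers fresh queries with fresh random values and repeats old answers."""

    def __init__(self, n_bytes: int = 16):
        self.n_bytes = n_bytes
        self.T = {}                               # list T: (x, m) -> y

    def __call__(self, x: bytes, m: bytes) -> bytes:
        if (x, m) not in self.T:                  # step 2: fresh query
            self.T[(x, m)] = secrets.token_bytes(self.n_bytes)   # step 3
        return self.T[(x, m)]                     # steps 1 and 4


S1 = LazyRandomFunction()
y = S1(b"x", b"m")
```

A fixed input length random oracle behaves in the same way on every query, which is why replacing $S_1$ by such an oracle in the next game does not change the view of D.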

Game 5: This is the final game of our argument. Here we finally replace
$S_1$ with the fixed input length random oracle $h$. Let G5 denote the event that
distinguisher D outputs 1 in Game 5. Since for a new query $S_1$ responds with a
random value and for a repeated query $S_1$ responds with a repeated value, Game 5 is
identical to Game 4. Therefore, we can deduce that $\Pr[G5] = \Pr[G4]$.

Now we can complete the proof of Theorem 5 by combining Games 1 to 5, and
observing that Game 1 is the same as the ERO scenario while Game 5 is the same as
the $MD^h$ scenario. Hence we can deduce that $\epsilon \le \frac{4(\sigma_H + q_h)^2 + 2(\sigma_H + q_h)}{2^n}$. □

4.2 Proof of Theorem 6

We define simulator S that simulates EO. S has initially empty list $\mathcal{L}_S$. On query
$(m, z)$, S is defined as follows: $z' \leftarrow h(z, m)$, and it returns $z'$. The simulator's
running time requires at most $O(q_{EO})$ time.
We need to prove that D cannot tell apart two scenarios, the $MD^h$ and ERO
scenarios, one where D has oracle access to $MD^h$ and S and the other where D
has access to RO and EO. The proof involves a hybrid argument starting in the
$MD^h$ scenario, and ending in the ERO scenario through a sequence of mutually
indistinguishable hybrid games.

Game 1: This is the $MD^h$ scenario, where D has oracle access to $MD^h$ and
$S(h)$. Let G1 denote the event that D outputs 1 after interacting with $MD^h$ and
$S(h)$. Thus $\Pr[G1] = \Pr[D^{MD^h, S(h)} \Rightarrow 1]$.

Game 2: In this game, we change the underlying primitive of MD from $h$
to S. Thus D interacts with $MD^S$ and $S(h)$. For any query to S, S poses it
to $h$ and returns the value received from $h$. Let G2 denote the event that D
outputs 1 in Game 2. Since the view of D remains unchanged in this game,
$\Pr[G2] = \Pr[G1]$.

Game 3: In this game, we remove S and $h$ and insert EO and RO. In this
game, D interacts with $MD^{EO}$ and EO and does not have access to RO. Since for a
fresh query EO returns a fresh random value and for a repeated query EO returns
the corresponding value, Game 3 is identical with Game 2. Let G3 denote the
event that D outputs 1 in Game 3. Since the view of D remains unchanged in
this game, $\Pr[G3] = \Pr[G2]$.

Game 4: This is the final game of our argument. In this game, we remove
$MD^{EO}$ and D interacts with RO and EO. We show that Game 4 is identical with
Game 3 unless the following bad events occur and the probability that bad events occur
is negligible.

Bad events are as follows. On query $(m, x)$, EO returns $y$:

- Bad1: $y = IV$.

On query $M$, RO returns $z$:

- Bad2: There is a pair $(M', z')$ in $\mathcal{L}_{RO}$, with $M \ne M'$, such that $z = z'$.

- Bad3: There is a triple $(m, x, y)$ in $\mathcal{L}_{EO}$ such that $z = x$.

We demonstrate that Game 4 is identical with Game 3 unless bad events occur
and the probability that bad events occur is negligible. Before we demonstrate
these facts, we give a useful property as follows.

Lemma 3. For any chain triples $(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i)$ in $\mathcal{L}_{EO}$, $y_i =
RO(m_1 \| \ldots \| m_i)$ holds unless bad events occur.

Due to lack of space, we omit this proof. We will show this in the full version. 

For the view of D for $MD^{EO}$ and RO, from Lemma 3, the view of D for
$MD^{EO}$ is equal to that for RO. For consistency in Game 3, from the definition of
MD and Lemma 3, for any chain triples $(m_1, x_1, y_1), \ldots, (m_i, x_i, y_i) \in \mathcal{L}_{EO}$, $y_i =
RO(m_1 \| \ldots \| m_i) = MD^{EO}(m_1 \| \ldots \| m_i)$ holds unless bad events occur. Therefore,
the answers given by S are consistent with those given by $MD^{EO}$. For consistency
in Game 4, from Lemma 3, for any chain triples $(x_1, m_1, y_1), \ldots, (x_i, m_i, y_i) \in
\mathcal{L}_{EO}$, $y_i = RO(m_1 \| \ldots \| m_i)$ holds unless bad events occur. Therefore, the answers
given by S are consistent with those given by RO. Therefore, Game 4 is identical
with Game 3 unless bad events occur.

Next we examine the probability that bad events occur as follows. 

Lemma 4. $\Pr[Bad1 \lor Bad2 \lor Bad3] \le \frac{q_1 + q_2^2 + q_1 q_2}{2^n}$, where $q_1$ is the maximum
number of invocations of EO and $q_2$ is the maximum number of invocations of RO.

Due to lack of space we omit this proof. 

Let G4 denote the event that the distinguisher D outputs 1 in Game 4, B3
be the event that $Bad1 \lor Bad2 \lor Bad3$ occurs in Game 3 and B4 be the event
that $Bad1 \lor Bad2 \lor Bad3$ occurs in Game 4. Therefore $|\Pr[G4] - \Pr[G3]| \le
\max\{\Pr[B3], \Pr[B4]\} = \frac{2(\sigma_H + q_{EO})^2 + (\sigma_H + q_{EO})}{2^n}$. □

4.3 MGF1 Transform 

In the above discussions, we ignored range extension algorithms such as MGF1 
which is an instantiated hash function of OAEP. When we consider these algo- 
rithms, we need to modify TRO and ERO. Due to the lack of space, we only 
modify TRO for MGF1 as follows and will discuss ERO in the full paper. 

Let $H : \{0,1\}^* \to \{0,1\}^n$ be some hash function and let $MGF1 : \{0,1\}^* \to
\{0,1\}^{jn}$ be $H(M \| [1]) \| H(M \| [2]) \| \ldots \| H(M \| [j])$ where $M$ is the input of the
hash function and $[s]$ is the encoding value of $s$. We confirm the security of
cryptosystems that use the MGF1 transform with $MD^h$ by the following approach.

— Propose the modification of TRO (denoted TRO', which consists of random
oracle $RO' : \{0,1\}^* \to \{0,1\}^{jn}$ and TO' of RO') such that $MGF1(TRO) \sqsubset
TRO'$.

— Prove cryptosystems' security in the TRO' model.

If we can find the above TRO', since $MD^h \sqsubset TRO$, cryptosystems that are secure in
the TRO' model are secure under $MD^h$.

TRO' is as follows. TRO' consists of random oracle $RO' : \{0,1\}^* \to \{0,1\}^{jn}$
and TO', a variant of TO. Let $z[s]$ be the $s$-th block of $z$. On trace query $(j, w)$
to TO',

— If there exist pairs $(M, z) \in \mathcal{L}_{RO'}$ such that $z[j] = w$, TO' returns
all such pairs.

— Otherwise, TO' returns $\perp$.

When $H$ is a random oracle, we can see $H(* \| [1]), \ldots, H(* \| [j])$ as independent
random oracles $RO_1, \ldots, RO_j$. In order to prove $MGF1(TRO) \sqsubset TRO'$, we need
to find a simulator that simulates each TO of $RO_1, \ldots, RO_j$. The simulator of TO
of $RO_s$ can be easily shown by using queries $(s, *)$ to TO'. Therefore, we can
prove $MGF1(TRO) \sqsubset TRO'$.
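For concreteness, the range extension described above can be sketched with a standard hash function. The sketch makes several assumptions that are not fixed by this section: SHA-256 plays the role of $H$, $[s]$ is encoded as a 4-byte big-endian counter, and the counter starts at 1 as in the definition above (the MGF1 of PKCS #1 uses the same 4-octet encoding but starts its counter at 0). The helper `block` is a hypothetical name for the map $z \mapsto z[s]$.

```python
import hashlib


def mgf1(M: bytes, j: int) -> bytes:
    """H(M||[1]) || H(M||[2]) || ... || H(M||[j]), with SHA-256 standing in for H."""
    return b"".join(
        hashlib.sha256(M + s.to_bytes(4, "big")).digest() for s in range(1, j + 1)
    )


def block(z: bytes, s: int, n_bytes: int = 32) -> bytes:
    """z[s]: the s-th n-bit block of z, counted from 1."""
    return z[(s - 1) * n_bytes : s * n_bytes]
```

A trace query $(s, w)$ to TO' asks for the inputs whose $s$-th output block is $w$, so `block(mgf1(M, j), s)` is exactly the value that such a query would be matched against.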
Cryptosystems that are secure in the TRO model are also secure in the TRO'
model by discussions similar to those for the cases of TRO. Note that the security
bound of these cryptosystems is dependent on $n$, not $jn$.

The same discussion can be applied to KDF3 which is an instantiated hash
function of RSA-KEM [11].

5 Security Analysis of RSA-KEM in TRO and ERO 
Models 

The RSA-based key encapsulation mechanism (RSA-KEM) scheme [11] is a secure
KEM scheme in the RO model. In this section, we consider the security of
RSA-KEM in the TRO and ERO models.

The notation of the scheme follows that in P, The security of RSA-KEM in 
the RO model is proved as follows; 

Lemma 5 (Security of RSA-KEM in the RO model [11]). If the RSA
problem is hard, then RSA-KEM satisfies IND-CCA for KEM where KDF is
modeled as RO.

5.1 Insecurity of RSA-KEM in TRO Model 

Though RSA-KEM is secure in the RO model, it is insecure in the TRO model. 
More specifically, we can show that RSA-KEM does not even satisfy IND-CPA 
for KEM in the TRO model. Note that IND-CPA means IND-CCA without DO. 

Theorem 7 (Insecurity of RSA-KEM in the TRO model). Even if the 
RSA problem is hard, RSA-KEM does not satisfy IND-CPA for KEM where 
KDF is modeled as TRO. 

Proof. We construct an adversary, A, which successfully plays the IND-CPA game by
using TRO KDF. The construction of A is as follows:

Input : $(n, e)$ as the public key
Output : $b'$ as the guessed bit

Step 1 : Return state and receive $(K^*_b, C^*_0)$ as the challenge. Pose the trace
query $K^*_b$ to KDF, and obtain $\{r\}$.

Step 2 : For all $r$ in $\{r\}$, check whether $r^e \equiv C^*_0 \pmod{n}$. If there is $r^*$ that
satisfies the relation, output $b' = 0$. Otherwise, output $b' = 1$.

We estimate the success probability of A. When challenge ciphertext $C^*_0$ is generated,
$r^*$ such that $K^*_0 = KDF(r^*)$ is certainly posed to KDF because $C^*_0$ is
generated following the protocol description. Thus, $\mathcal{L}_{KDF}$ contains $(r^*, C^*_0, K^*_0)$.
If $(r^*, C^*_0, K^*_b)$ is not in $\mathcal{L}_{KDF}$, then $b = 1$. Therefore, A can successfully play
the IND-CPA game. □
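The attack can be replayed on a toy instance. The sketch below is illustrative only: the parameters $p = 61$, $q = 53$, $e = 17$ are far too small for real use, the KDF is a lazily sampled random oracle with a dictionary `l_kdf` as its list, and the function names `kdf`, `trace`, `challenge`, and `adversary` are assumptions of the example.

```python
import secrets

p, q, e = 61, 53, 17
n = p * q                          # toy RSA modulus (insecure, illustration only)
l_kdf = {}                         # list of the KDF oracle: r -> K


def kdf(r: int) -> bytes:
    """KDF modeled as a lazily sampled random oracle."""
    if r not in l_kdf:
        l_kdf[r] = secrets.token_bytes(16)
    return l_kdf[r]


def trace(K: bytes):
    """Trace oracle TO: all inputs r whose recorded output is K."""
    return [r for r, v in l_kdf.items() if v == K]


def challenge(b: int):
    """Encapsulate honestly; hand out the real key if b == 0, a random key otherwise."""
    r = secrets.randbelow(n)
    C = pow(r, e, n)
    K0 = kdf(r)
    K = K0 if b == 0 else secrets.token_bytes(16)
    return K, C


def adversary(K: bytes, C: int) -> int:
    """Steps 1 and 2 of A: trace K, then test each candidate r against C."""
    return 0 if any(pow(r, e, n) == C for r in trace(K)) else 1
```

In this sketch the adversary's guess matches the challenge bit except with the negligible probability that a random 16-byte key coincides with a recorded KDF output.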

5.2 Security of RSA-KEM in ERO Model 

We can also prove the security of RSA-KEM in the ERO model as well as in the 
RO model. 
Theorem 8 (Security of RSA-KEM in the ERO model). If the RSA problem
is $(t', \epsilon')$-hard, then RSA-KEM satisfies $(t, \epsilon)$-IND-CCA for KEM as follows:
$t' = t + (q_{RKDF} + q_{EKDF}) \cdot expo$, $\epsilon' \ge \epsilon - \frac{q_D}{n}$, where KDF is modeled as ERO,
$q_{RKDF}$ is the number of hash queries posed to the RO of KDF, $q_{EKDF}$ is the
number of extension attack queries posed to the EO of KDF, $q_D$ is the number
of queries posed to the decryption oracle DO and $expo$ is the running time of
exponentiation modulo $n$.

The proof will be described in the full paper. 
On the Analysis of Cryptographic Assumptions
in the Generic Ring Model* 
Abstract. At Eurocrypt 2009 Aggarwal and Maurer proved that breaking
RSA is equivalent to factoring in the generic ring model. This model
captures algorithms that may exploit the full algebraic structure of the 
ring of integers modulo n, but no properties of the given representation of 
ring elements. This interesting result raises the question how to interpret 
proofs in the generic ring model. For instance, one may be tempted to 
deduce that a proof in the generic model gives some evidence that solving 
the considered problem is also hard in a general model of computation. 
But is this reasonable? 

We prove that computing the Jacobi symbol is equivalent to factoring 
in the generic ring model. Since there are simple and efficient non-generic 
algorithms computing the Jacobi symbol, we show that the generic model 
cannot give any evidence towards the hardness of a computational prob- 
lem. Despite this negative result, we also argue why proofs in the generic 
ring model are still interesting, and show that solving the quadratic 
residuosity and subgroup decision problems is generically equivalent to 
factoring. 


1 Introduction 

The security of asymmetric cryptographic systems relies on assumptions that 
certain computational problems, mostly from number theory and algebra, are 
intractable. Since proving useful lower complexity bounds in a general model of 
computation seems to be impossible with currently available techniques, these 
assumptions have been analyzed in restricted models, see f22l I 7l<Slfj . for instance. 
A natural and very general class of algorithms is considered in the generic ring 
model. This model captures all algorithms solving problems defined over an alge- 
braic ring without exploiting specific properties of a given representation of ring 
elements. Such algorithms work in a similar way for arbitrary representations of 
ring elements, thus are generic. 

Considering fundamental cryptographic problems in the generic model is mo- 
tivated by the following ideas. First, showing that a cryptographic assumption 

* This is an extended abstract, the full version is available on eprint uni. Supported 
by the European Community (FP7/2007-2013), grant ICT-2007-216646 - European 
Network of Excellence in Cryptology II (ECRYPT II). 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 399 [lie] 2009. 
holds with respect to a restricted but meaningful class of algorithms might indicate
that the idea of basing the security of cryptosystems on this assumption is
not totally flawed, and may therefore be seen as evidence that the assumption 
is also valid in a general model of computation. Second, showing that a large 
class of algorithms is not able to solve a computational problem efficiently is an 
important insight for the search for cryptanalytic algorithms, and can be used 
to deduce the optimality of certain classes of algorithms. Moreover, the generic 
model is a valuable tool to study the relationship among computational prob- 
lems, such as the equivalence of the discrete logarithm and the Diffie-Hellman 
problem, as done in for instance. 

In this paper we prove a general theorem which states that solving certain
subset membership problems in the ring $\mathbb{Z}_n$ is equivalent to factoring $n$. This
main theorem allows us to provide an example for a computational problem with 
high cryptographic relevance which is easy to solve in general, but equivalent to 
factoring in the generic model. Concretely, we show that computing the Jacobi 
symbol is equivalent to factoring in the generic ring model. 

For many common idealized models in cryptography it has been shown that 
a cryptographic reduction in the ideal model need not guarantee security in 
the “real world”. Well-known examples are, for instance, the random oracle 
model H3 ■ the ideal cipher model 0, and the generic group model All 

these results have in common that they used somewhat contrived constructions 
that deviate from standard cryptographic practice^ In contrast, our result on the 
generic equivalence of computing the Jacobi symbol and factoring is an example 
for a truly practical computational problem that is provably hard in the generic 
model, but easy to solve in general. This is an important aspect for interpreting 
results in the generic ring model, like |7IKI1 5I2IT] . Thus a proof in the generic 
model is unfortunately not even an indicator that the considered problem is 
indeed useful for cryptographic applications. 

This negative result does not affect the other mentioned motivations for the 
analysis of computational problems in the generic ring model. A lower bound 
in this model allows to deduce the optimality of certain classes of algorithms, 
and gives insight into the relationship between cryptographic problems, which is 
also of interest. Motivated by this fact, we also show that solving the quadratic 
residuosity and subgroup decision problems is generically equivalent to factoring. 
For the latter problem we show that the equivalence holds even in presence of a 
Diffie-Hellman oracle. Thus, a Diffie-Hellman oracle does not help in solving the 
subgroup decision problem. 

By taking a closer look at the construction of the simulator used in the proof 
of our main theorem, we furthermore deduce that for a certain class of compu- 
tational problems there exists an efficient generic ring algorithm if and only if 
there is an efficient straight line program solving the problem. 


An exception is the result of |2I3, showing a (non-generic) attack on a scheme with 
provable security in the generic model. However, m note that this stems not from 
a weakness in the generic model, but from an incorrect security proof. 
1.1 Related Work

Previous work considering fundamental cryptographic assumptions in the generic 
model considered primarily discrete logarithm-based problems and the RSA 
problem. Starting with Shoup’s seminal paper E2. it was proven that solv- 
ing the discrete logarithm problem, the Diffie-Hellman problem, and related 
problems |l XII 7121] is hard with respect to generic group algorithms. Damgard 
and Koprowski showed the generic intractability of root extraction in groups of 
hidden order m- 

Brown jH| reduced the problem of factoring integers to solving the low-exponent 
RSA problem with straight line programs , which are a subclass of generic ring 
algorithms. Leander and Rupp m augmented this result to generic ring algo- 
rithms, where the considered algorithms may only perform the operations addi- 
tion, subtraction and multiplication modulo n, but not multiplicative inversion 
operations. Recently, Aggarwal and Maurer p extended this result from low- 
exponent RSA to full RSA and to generic ring algorithms that may also com- 
pute multiplicative inverses. Boneh and Venkatesan |Zj have shown that there is 
no straight line program reducing integer factorization to the low-exponent RSA 
problem, unless factoring integers is easy. 

The notion of generic ring algorithms has also been applied to study the 
relationship between the discrete logarithm and the Diffie-Hellman problem and 
the existence of ring- homomorphic encryption schemes Kill (1121 . 

2 Preliminaries 

2.1 Notation 

For a set $A$ and a probability distribution $\mathcal{D}$ on $A$, we denote with $a \xleftarrow{\mathcal{D}} A$ the
action of sampling an element $a$ from $A$ according to distribution $\mathcal{D}$. We denote
with $\mathcal{U}$ the uniform distribution. When sampling $k$ elements $a_1, \ldots, a_k \xleftarrow{\mathcal{D}} A$, we
assume that all elements are chosen independently.

Throughout the paper we let $n$ be the product of at least two different primes,
and denote with $n = \prod_{i=1}^{k} p_i^{e_i}$ the prime factor decomposition of $n$ such that
$\gcd(p_i^{e_i}, p_j^{e_j}) = 1$ for $i \ne j$.

Let $P = (S_1, \ldots, S_m)$ be a finite sequence. Then $|P|$ denotes the length of $P$,
i.e. $|P| = m$. For $k \le m$ we denote with $P_k$ the subsequence $(S_1, \ldots, S_k)$ of $P$.
For a sequence $P$ we write $P_k \sqsubseteq P$ to denote that $P_k$ is a subsequence of
$P$ such that $P_k$ consists of the first $|P_k|$ elements of $P$.


2.2 Uniform Closure 

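By the Chinese Remainder Theorem, for $n = \prod_{i=1}^{k} p_i^{e_i}$ the ring $\mathbb{Z}_n$ is isomorphic
to the direct product of rings $\mathbb{Z}_{p_1^{e_1}} \times \cdots \times \mathbb{Z}_{p_k^{e_k}}$. Let $\phi$ be the isomorphism
$\mathbb{Z}_{p_1^{e_1}} \times \cdots \times \mathbb{Z}_{p_k^{e_k}} \to \mathbb{Z}_n$, and for $C \subseteq \mathbb{Z}_n$ let $C_i := \{y \bmod p_i^{e_i} \mid y \in C\}$ for
$1 \le i \le k$.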


402 T. Jager and J. Schwenk 


Definition 1 (Uniform Closure). We say that $\mathcal{U}[C] \subseteq \mathbb{Z}_n$ is the uniform
closure of $C \subseteq \mathbb{Z}_n$, if

$$\mathcal{U}[C] = \{y \in \mathbb{Z}_n \mid y = \phi(y_1, \ldots, y_k),\ y_i \in C_i \text{ for } 1 \le i \le k\}.$$

In particular note that $C \subseteq \mathcal{U}[C]$, but not necessarily $\mathcal{U}[C] \subseteq C$. The following
lemma follows directly from the above definition.

Lemma 1. Sampling $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ uniformly random from $\mathcal{U}[C]$ is equivalent to
sampling $y_i$ uniformly and independently from $C_i$ for $1 \le i \le k$ and setting
$y = \phi(y_1, \ldots, y_k)$.
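To make Definition 1 concrete, the following minimal sketch computes the uniform closure of a small set by brute force. The helper names `crt` and `uniform_closure` and the toy modulus $n = 15$ are illustrative assumptions, not part of the paper.

```python
from itertools import product
from math import prod


def crt(residues, moduli):
    """Map (y_1, ..., y_k) to the unique y modulo prod(moduli) with y = y_i mod q_i.

    The moduli are assumed to be pairwise coprime prime powers q_i = p_i^e_i.
    """
    n = prod(moduli)
    y = 0
    for r, q in zip(residues, moduli):
        m = n // q
        y += r * m * pow(m, -1, q)  # m * (m^-1 mod q) is 1 mod q and 0 mod the others
    return y % n


def uniform_closure(C, moduli):
    """Return U[C]: all combinations of the component sets C_i, mapped back by CRT."""
    components = [sorted({y % q for y in C}) for q in moduli]
    return {crt(t, moduli) for t in product(*components)}


# Toy example with n = 15 = 3 * 5 and C = {1, 8}.
C = {1, 8}
U = uniform_closure(C, [3, 5])
print(sorted(U))
```

Here $C_1 = \{1, 2\}$ and $C_2 = \{1, 3\}$, so the closure has four elements and strictly contains $C$.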

2.3 Straight Line Programs 

A straight line program over a ring R is a generic ring algorithm performing a 
fixed sequence of ring operations, without branching, that outputs an element 
of R. Thus straight line programs are a subclass of generic ring algorithms. 
The following definition is a simple extension of 0 Definition 1] to straight line 
programs that may also compute multiplicative inverses. 

Definition 2 (Straight Line Programs). A straight line program $P$ of length
$m$ over $\mathbb{Z}_n$ is a sequence of tuples

$$P = ((i_1, j_1, \circ_1), \ldots, (i_m, j_m, \circ_m))$$

where $-1 \le i_k, j_k < k$ and $\circ_k \in \{+, -, \cdot, /\}$ for $k \in \{1, \ldots, m\}$. The output $P(x)$
of straight line program $P$ on input $x \in \mathbb{Z}_n$ is computed as follows.

1. Initialize $L_{-1} := 1 \in \mathbb{Z}_n$ and $L_0 := x$.

2. For $k$ from 1 to $m$ do:

- if $\circ_k = /$ and $L_{j_k} \notin \mathbb{Z}_n^*$ then return $\bot$,

- else set $L_k := L_{i_k} \circ_k L_{j_k}$.

3. Return $P(x) = L_m$.

We say that each triple $(i, j, \circ) \in P$ is an SLP-step.

For notational convenience, for a given straight line program $P$ we will denote
with $P_k$ the straight line program given by the sequence of the first $k$ elements of
$P$, with the additional convention that $P_{-1}(x) = 1$ and $P_0(x) = x$ for all $x \in \mathbb{Z}_n$.
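The evaluation procedure of Definition 2 can be sketched in a few lines. This is an illustrative sketch only: the function name `evaluate_slp`, the use of `None` for the error symbol, and the string encoding of the operations are assumptions made here.

```python
from math import gcd

BOTTOM = None  # stands in for the error symbol


def evaluate_slp(steps, x, n):
    """Evaluate a straight line program over Z_n on input x.

    steps is a list of triples (i, j, op) with -1 <= i, j < k for the k-th step
    and op in {"+", "-", "*", "/"}. Returns P(x), or BOTTOM if some step
    divides by a non-unit.
    """
    L = {-1: 1 % n, 0: x % n}
    for k, (i, j, op) in enumerate(steps, start=1):
        a, b = L[i], L[j]
        if op == "/":
            if gcd(b, n) != 1:
                return BOTTOM
            L[k] = a * pow(b, -1, n) % n
        elif op == "+":
            L[k] = (a + b) % n
        elif op == "-":
            L[k] = (a - b) % n
        elif op == "*":
            L[k] = (a * b) % n
        else:
            raise ValueError(f"unknown operation {op!r}")
    return L[len(steps)]


# P computes x^2 + 1: L_1 = L_0 * L_0, L_2 = L_1 + L_{-1}.
square_plus_one = [(0, 0, "*"), (1, -1, "+")]
print(evaluate_slp(square_plus_one, 2, 15))
```

With the empty program the function returns $x$, matching the convention $P_0(x) = x$.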

2.4 Generic Ring Algorithms 

Similar to straight line programs, generic ring algorithms perform a sequence
of ring operations on the input values $1, x \in \mathbb{Z}_n$. However, while straight line
programs perform the same fixed sequence of ring operations on any input value,
generic ring algorithms can decide adaptively which ring operation is performed 
next. The decision is made either based on equality checks, or on coin tosses. 
Moreover, the output of generic ring algorithms is not restricted to ring elements. 
We formalize the notion of generic ring algorithms in terms of a game between
an algorithm $\mathcal{A}$ and a black-box $\mathcal{O}$, the generic ring oracle. The generic ring
oracle receives as input a secret value $x \in \mathbb{Z}_n$. It maintains a sequence $P$, which
is set to the empty sequence at the beginning of the game, and implements two
internal subroutines test() and equal().

- The test()-procedure takes a tuple $(j, \circ) \in \{-1, \ldots, |P|\} \times \{+, -, \cdot, /\}$ as
input. The procedure returns false if $\circ = /$ and $P_j(x) \notin \mathbb{Z}_n^*$, and true otherwise.

- The equal()-procedure takes a tuple $(i, j) \in \{-1, \ldots, |P|\} \times \{-1, \ldots, |P|\}$
as input. The procedure returns true if $P_i(x) \equiv P_j(x) \bmod n$ and false
otherwise.

In order to perform computations, the algorithm submits SLP-steps to $\mathcal{O}$.
Whenever the algorithm submits $(i, j, \circ)$ with $\circ \in \{+, -, \cdot, /\}$, the oracle runs
test$(j, \circ)$. If test$(j, \circ)$ = false, the oracle returns the error symbol $\bot$. Otherwise
$(i, j, \circ)$ is appended to $P$. Moreover, the algorithm can query the oracle to check
for equality of computed ring elements by submitting a query $(i, j, \circ)$ such that
$\circ \in \{=\}$. In this case the oracle returns equal$(i, j)$. We measure the complexity
of $\mathcal{A}$ by the number of oracle queries.
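The game described above can be mirrored by a small class. This is a hedged sketch, not the paper's formalism: the class name `GenericRingOracle`, the cached `values` table, and the convention of returning the index of the newly appended step on success are assumptions introduced for illustration.

```python
from math import gcd


class GenericRingOracle:
    """Toy generic ring oracle over Z_n holding a secret value x."""

    def __init__(self, n, x):
        self.n = n
        self.values = {-1: 1 % n, 0: x % n}  # values[k] caches P_k(x)
        self.steps = []                      # the sequence P
        self.queries = 0                     # complexity measure

    def query(self, i, j, op):
        self.queries += 1
        a, b = self.values[i], self.values[j]
        if op == "=":                        # equal(i, j)
            return a == b
        if op == "/":
            if gcd(b, self.n) != 1:          # test(j, /) returned false
                return None                  # error symbol
            c = a * pow(b, -1, self.n)
        elif op == "+":
            c = a + b
        elif op == "-":
            c = a - b
        elif op == "*":
            c = a * b
        else:
            raise ValueError(f"unknown operation {op!r}")
        self.steps.append((i, j, op))
        self.values[len(self.steps)] = c % self.n
        return len(self.steps)               # index of the new SLP-step (assumption)


oracle = GenericRingOracle(15, 4)
k = oracle.query(0, 0, "*")          # P_1(x) = x * x
print(oracle.query(k, -1, "="))      # is x^2 equal to 1 modulo 15?
```

The algorithm only ever sees step indices, truth values, and the error symbol, never the ring elements themselves.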

2.5 Some Lemmas on Straight Line Programs over $\mathbb{Z}_n$

In the following we will state a few lemmas on straight line programs over $\mathbb{Z}_n$
that will be useful for the proof of our main theorem.

Lemma 2. Suppose there exists a straight line program $P$ such that for $x, x' \in \mathbb{Z}_n$
it holds that $P(x') \neq \bot$ and $P(x) = \bot$. Then there exists $P_j \sqsubseteq P$ such that
$P_j(x') \in \mathbb{Z}_n^*$ and $P_j(x) \notin \mathbb{Z}_n^*$.

Proof. $P(x) = \bot$ means that there exists an SLP-step $(i, j, \circ) \in P$ such that
$\circ = /$ and $L_j = P_j(x) \notin \mathbb{Z}_n^*$. However, $P(x')$ does not evaluate to $\bot$, thus it
must hold that $P_j(x') \in \mathbb{Z}_n^*$.

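The following lemma provides a lower bound on the probability of factoring $n$
by evaluating a certain straight line program $P$ with $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ and computing
$\gcd(n, P(y))$, relative to the probability that $P(x') \notin \mathbb{Z}_n^*$ and $P(x) \in \mathbb{Z}_n^*$ for
randomly chosen $x, x' \xleftarrow{\mathcal{U}} C$.

Lemma 3. For any straight line program $P$ and $C \subseteq \mathbb{Z}_n$ it holds that

$$\Pr[P(x') \notin \mathbb{Z}_n^* \text{ and } P(x) \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] \le \left(\frac{|\mathcal{U}[C]|}{|C|}\right)^2 \Pr[\gcd(n, P(y)) \notin \{1, n\} \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]].$$

Similar to the above, the following lemma provides a lower bound on the
probability of factoring $n$ by computing $\gcd(n, P(y) - Q(y))$ with $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ for two
given straight line programs $P$ and $Q$, relative to the probability
$\Pr[P(x) \equiv_n Q(x) \text{ and } P(x') \not\equiv_n Q(x') \mid x, x' \xleftarrow{\mathcal{U}} C]$.

Lemma 4. For any pair $(P, Q)$ of straight line programs and $C \subseteq \mathbb{Z}_n$ it holds that

$$\Pr[P(x) \equiv_n Q(x) \text{ and } P(x') \not\equiv_n Q(x') \mid x, x' \xleftarrow{\mathcal{U}} C] \le \left(\frac{|\mathcal{U}[C]|}{|C|}\right)^2 \Pr[\gcd(n, P(y) - Q(y)) \notin \{1, n\} \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]].$$

The proofs of Lemmas 3 and 4 are based on the Chinese Remainder Theorem.
Full proofs are given in Appendices C and D of the full version. We also discuss
the intuition behind these lemmas in Appendix E of the full version.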
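For a toy modulus, both sides of the inequality in Lemma 3 can be computed exhaustively. The sketch below is an illustration under stated assumptions: the program $P$ is represented by an ordinary Python function (a division-free program computing $x - 1$), and the names `uniform_closure` and `lemma3_sides` are hypothetical helpers.

```python
from fractions import Fraction
from math import gcd


def uniform_closure(C, moduli, n):
    """U[C]: all y in Z_n whose residue modulo each q_i occurs among the residues of C."""
    comps = [{c % q for c in C} for q in moduli]
    return [y for y in range(n) if all(y % q in comp for q, comp in zip(moduli, comps))]


def lemma3_sides(P, C, moduli):
    """Return (left side, right side) of Lemma 3 as exact fractions."""
    n = 1
    for q in moduli:
        n *= q
    is_unit = lambda v: gcd(v % n, n) == 1
    pr_unit = Fraction(sum(is_unit(P(x)) for x in C), len(C))
    lhs = (1 - pr_unit) * pr_unit                  # x and x' are sampled independently
    U = uniform_closure(C, moduli, n)
    hits = sum(gcd(P(y) % n, n) not in (1, n) for y in U)
    rhs = Fraction(len(U), len(C)) ** 2 * Fraction(hits, len(U))
    return lhs, rhs


lhs, rhs = lemma3_sides(lambda x: x - 1, [1, 2, 4], [3, 5])
print(lhs, rhs)
```

For this small instance the bound is far from tight; the example only checks that the two sides are related as the lemma states.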

3 Subset Membership Problems in Generic Rings 

Definition 3 (Subset Membership Problem). Let $C \subseteq \mathbb{Z}_n$ and $V \subseteq \mathbb{Z}_n$ be
subsets of $\mathbb{Z}_n$ such that $V \subseteq C \subseteq \mathbb{Z}_n$. The subset membership problem defined
by $(C, V)$ is: given $x \xleftarrow{\mathcal{U}} C$, decide whether $x \in V$.

Whenever considering a subset membership problem in the following we assume
that $|V| > 1$.

Let $(C, V)$ be subsets of $\mathbb{Z}_n$ defining a subset membership problem. We formalize
the notion of subset membership problems in the generic ring model in terms
of a game between an algorithm $\mathcal{A}$ and a generic ring oracle $\mathcal{O}_{smp}$. Oracle $\mathcal{O}_{smp}$
is defined exactly like the generic ring oracle described in Section 2.4, except
that $\mathcal{O}_{smp}$ receives a uniformly random element $x \xleftarrow{\mathcal{U}} C$ as input. We say that $\mathcal{A}$
wins the game, if $x \in V$ and $\mathcal{A}^{\mathcal{O}_{smp}}(n) = 1$, or $x \notin V$ and $\mathcal{A}^{\mathcal{O}_{smp}}(n) = 0$.

Note that any algorithm for a given subset membership problem $(C, V)$ has
at least the trivial success probability $\Pi(C, V) := \max\{|V|/|C|,\ 1 - |V|/|C|\}$ by
guessing, due to the fact that $x$ is sampled uniformly from $C$. For an algorithm
solving the subset membership problem given by $(C, V)$ with success probability
$\Pr[\mathcal{S}]$, we denote with

$$\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n)) := |\Pr[\mathcal{S}] - \Pi(C, V)|$$

the advantage of $\mathcal{A}$.

Theorem 1. For any generic ring algorithm $\mathcal{A}$ solving a given subset membership
problem $(C, V)$ over $\mathbb{Z}_n$ with advantage $\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))$ by performing
$m$ queries to $\mathcal{O}_{smp}$, there exists an algorithm $\mathcal{B}$ that outputs a factor of $n$ with
success probability at least

$$\frac{\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))}{2m(m^2 + 5m + 3)} \cdot \left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2$$

by running $\mathcal{A}$ once and performing $O(m^3)$ additional operations in $\mathbb{Z}_n$, $m$ gcd-computations
on $\lceil \log_2 n \rceil$-bit numbers, and sampling $m$ random elements from
$\mathcal{U}[C]$.

Proof Outline. We replace $\mathcal{O}_{smp}$ with a simulator $\mathcal{O}_{sim}$. Let $\mathcal{S}_{sim}$ denote the
event that $\mathcal{A}$ is successful when interacting with the simulator, and let $\mathcal{F}$ denote
the event that $\mathcal{O}_{sim}$ answers a query of $\mathcal{A}$ differently from how $\mathcal{O}_{smp}$ would have
answered. Then $\mathcal{O}_{smp}$ and $\mathcal{O}_{sim}$ are indistinguishable unless $\mathcal{F}$ occurs. Therefore
the success probability $\Pr[\mathcal{S}]$ of $\mathcal{A}$ when interacting with the original oracle is
upper bounded by $\Pr[\mathcal{S}_{sim}] + \Pr[\mathcal{F}]$. We derive a bound on $\Pr[\mathcal{S}_{sim}]$ and describe a
factoring algorithm whose success probability is lower bounded in terms of $\Pr[\mathcal{F}]$.

3.1 Introducing a Simulation Oracle 

We replace oracle $\mathcal{O}_{smp}$ with a simulator $\mathcal{O}_{sim}$. $\mathcal{O}_{sim}$ receives $x \xleftarrow{\mathcal{U}} C$ as input, but
never uses this value throughout the game. Instead, all computations are performed
independently of the challenge value $x$. Note that the original oracle $\mathcal{O}_{smp}$
uses $x$ only inside the test() and equal() procedures. Let us therefore consider
an oracle $\mathcal{O}_{sim}$ which is defined exactly like $\mathcal{O}_{smp}$, but replaces the procedures
test() and equal() with procedures testsim() and equalsim().

- The testsim()-procedure samples $x_r \xleftarrow{\mathcal{U}} C$ and returns false if $\circ = /$ and
$P_j(x_r) \notin \mathbb{Z}_n^*$, and true otherwise (even if $P_j(x_r) = \bot$).

- The equalsim()-procedure samples $x_r \xleftarrow{\mathcal{U}} C$ and returns true if $P_i(x_r) \equiv
P_j(x_r) \bmod n$ and false otherwise (even if $P_i(x_r) = \bot$ or $P_j(x_r) = \bot$).

Note that the simulator samples $m$ random values $x_r$, $r \in \{1, \ldots, m\}$. Also note
that all computations of $\mathcal{A}$ are independent of the challenge value $x$ when interacting
with $\mathcal{O}_{sim}$. Hence, any algorithm $\mathcal{A}$ has at most trivial success probability
in the simulation game, and therefore

$$\Pr[\mathcal{S}_{sim}] \le \Pi(C, V).$$
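The simulator can be sketched as follows. This is a minimal illustration, assuming a toy representation of $C$ as a Python list; the names `SimulationOracle` and `eval_prefix`, the sentinel `BOTTOM`, and the convention of returning the new step index on success are assumptions made here. Note that the constructor takes no challenge value at all, which reflects that $\mathcal{O}_{sim}$ never uses $x$.

```python
import random
from math import gcd

BOTTOM = object()  # stands in for the error symbol


def eval_prefix(steps, k, x, n):
    """Return P_k(x) over Z_n (with P_{-1}(x) = 1 and P_0(x) = x), or BOTTOM."""
    L = {-1: 1 % n, 0: x % n}
    for idx, (i, j, op) in enumerate(steps[:max(k, 0)], start=1):
        a, b = L[i], L[j]
        if op == "/":
            if gcd(b, n) != 1:
                return BOTTOM                      # division by a non-unit
            L[idx] = a * pow(b, -1, n) % n
        else:
            L[idx] = {"+": a + b, "-": a - b, "*": a * b}[op] % n
    return L[k]


class SimulationOracle:
    """Toy simulator: every query is answered with a fresh sample x_r from C."""

    def __init__(self, n, C, rng=None):
        self.n, self.C = n, list(C)
        self.rng = rng or random.Random()
        self.steps = []                            # the sequence P

    def query(self, i, j, op):
        if op not in ("+", "-", "*", "/", "="):
            raise ValueError(f"unknown operation {op!r}")
        x_r = self.rng.choice(self.C)              # independent of the challenge
        vi = eval_prefix(self.steps, i, x_r, self.n)
        vj = eval_prefix(self.steps, j, x_r, self.n)
        if op == "=":                              # equalsim(): false if a side is BOTTOM
            return vi is not BOTTOM and vj is not BOTTOM and vi == vj
        if op == "/" and vj is not BOTTOM and gcd(vj, self.n) != 1:
            return BOTTOM                          # testsim() returned false
        self.steps.append((i, j, op))
        return len(self.steps)


sim = SimulationOracle(15, [1, 2, 4, 7], rng=random.Random(0))
sim.query(0, 0, "-")            # P_1(x) = x - x
print(sim.query(1, 1, "="))     # trivially true for every sample
```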

3.2 Bounding the Probability of Simulation Failure 

We say that a simulation failure, denoted $\mathcal{F}$, occurs if $\mathcal{O}_{sim}$ does not simulate
$\mathcal{O}_{smp}$ perfectly. Observe that an interaction of $\mathcal{A}$ with $\mathcal{O}_{sim}$ is perfectly
indistinguishable from an interaction with $\mathcal{O}_{smp}$, unless at least one of the following
events occurs.

1. The testsim()-procedure fails to simulate test() perfectly. This means that
testsim() returns false on a procedure call where test() would have returned
true, or testsim() returns true where test() would have returned false. Let
$\mathcal{F}_{test}$ denote the event that this happens on at least one call of testsim().

2. The equalsim()-procedure fails to simulate equal() perfectly. This means that
equalsim() has returned true where equal() would have returned false, or
equalsim() has returned false where equal() would have returned true. Let
$\mathcal{F}_{equal}$ denote the event that this happens on at least one call of equalsim().

Since $\mathcal{F}$ implies that at least one of the events $\mathcal{F}_{test}$ and $\mathcal{F}_{equal}$ has occurred, it
holds that

$$\Pr[\mathcal{F}] \le \Pr[\mathcal{F}_{test}] + \Pr[\mathcal{F}_{equal}].$$

In the following we will bound $\Pr[\mathcal{F}_{test}]$ and $\Pr[\mathcal{F}_{equal}]$ separately.
Bounding the Probability of $\mathcal{F}_{test}$. The testsim()-procedure fails to simulate
test() only if either testsim() has returned false where test() would have returned
true, or testsim() has returned true where test() would have returned false. A
necessary condition² for this is that there exist $P_j \sqsubseteq P$ and $x_r \in \{x_1, \ldots, x_m\}$
such that

$$(P_j(x) \in \mathbb{Z}_n^* \text{ and } P_j(x_r) \notin \mathbb{Z}_n^*) \text{ or } (P_j(x) = \bot \text{ and } P_j(x_r) \in \mathbb{Z}_n^*)$$

or

$$(P_j(x_r) \in \mathbb{Z}_n^* \text{ and } P_j(x) \notin \mathbb{Z}_n^*) \text{ or } (P_j(x_r) = \bot \text{ and } P_j(x) \in \mathbb{Z}_n^*).$$

We can simplify this condition a little by applying Lemma 2. The existence of
$P_j \sqsubseteq P$ and $x_r$ such that ($P_j(x_r) = \bot$ and $P_j(x) \in \mathbb{Z}_n^*$) implies the existence
of $P_k \sqsubseteq P$ such that $k < j$ and ($P_k(x_r) \notin \mathbb{Z}_n^*$ and $P_k(x) \in \mathbb{Z}_n^*$). An analogous
argument holds for the case ($P_j(x) = \bot$ and $P_j(x_r) \in \mathbb{Z}_n^*$). Hence, the testsim()-procedure
fails to simulate test() only if there exists $P_j \sqsubseteq P$ such that

$$(P_j(x) \in \mathbb{Z}_n^* \text{ and } P_j(x_r) \notin \mathbb{Z}_n^*) \text{ or } (P_j(x_r) \in \mathbb{Z}_n^* \text{ and } P_j(x) \notin \mathbb{Z}_n^*).$$

Proposition 1

$$\Pr[\mathcal{F}_{test}] \le 2m(m + 1) \max_{0 \le j \le m} \left\{ \Pr[P_j(x) \notin \mathbb{Z}_n^* \text{ and } P_j(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] \right\}$$

We sketch the proof of Proposition 1 in Appendix B. A full proof is given in
Appendix F of the full version.


Bounding the Probability of $\mathcal{F}_{equal}$. The equalsim()-procedure fails to simulate
equal() only if either equalsim() has returned false where equal() would
have returned true, or equalsim() has returned true where equal() would have
returned false. A necessary³ condition for this is that there exist $P_i, P_j \sqsubseteq P$ and
$x_r \in \{x_1, \ldots, x_m\}$ such that

$$\begin{aligned}
& (P_i(x) \equiv_n P_j(x) \text{ and } P_i(x_r) \not\equiv_n P_j(x_r)) \\
\text{or } & (P_i(x) \equiv_n P_j(x) \text{ and } (P_i(x_r) = \bot \text{ or } P_j(x_r) = \bot)) \\
\text{or } & (P_i(x_r) \equiv_n P_j(x_r) \text{ and } P_i(x) \not\equiv_n P_j(x)) \\
\text{or } & (P_i(x_r) \equiv_n P_j(x_r) \text{ and } (P_i(x) = \bot \text{ or } P_j(x) = \bot)).
\end{aligned}$$

Again we can apply Lemma 2 to simplify this a little: the existence of $P_j \sqsubseteq P$
and $x_r$ such that ($P_j(x_r) = \bot$ and $P_j(x) \neq \bot$) implies the existence of $P_k \sqsubseteq P$
such that ($P_k(x_r) \notin \mathbb{Z}_n^*$ and $P_k(x) \in \mathbb{Z}_n^*$). Analogous arguments hold for the
other cases where one straight line program evaluates to $\bot$. Hence, the equalsim()-procedure
fails to simulate equal() only if there exist $P_i, P_j \sqsubseteq P$ or $P_k \sqsubseteq P$ such
that

$$\begin{aligned}
& (P_i(x) \equiv_n P_j(x) \text{ and } P_i(x_r) \not\equiv_n P_j(x_r)) \\
\text{or } & (P_i(x_r) \equiv_n P_j(x_r) \text{ and } P_i(x) \not\equiv_n P_j(x)) \\
\text{or } & (P_k(x_r) \notin \mathbb{Z}_n^* \text{ and } P_k(x) \in \mathbb{Z}_n^*) \\
\text{or } & (P_k(x) \notin \mathbb{Z}_n^* \text{ and } P_k(x_r) \in \mathbb{Z}_n^*).
\end{aligned}$$

² The condition is not sufficient, because algorithm $\mathcal{A}$ need not have queried a division
by $P_j$ in its $r$-th query.

³ The condition is not sufficient, because algorithm $\mathcal{A}$ need not have queried $(i, j, =)$
in its $r$-th query.

Proposition 2

$$\Pr[\mathcal{F}_{equal}] \le 2m(m^2 + 3m + 1)\Phi + 2m(m + 1)\Gamma,$$

where

$$\Phi = \max_{-1 \le i < j \le m} \left\{ \Pr[P_i(x) \equiv_n P_j(x) \text{ and } P_i(x') \not\equiv_n P_j(x') \mid x, x' \xleftarrow{\mathcal{U}} C] \right\}$$

$$\Gamma = \max_{0 \le k \le m} \left\{ \Pr[P_k(x) \notin \mathbb{Z}_n^* \text{ and } P_k(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] \right\}.$$

The proof of Proposition 2, which is based on the same ideas as the proof of
Proposition 1, is given in Appendix G of the full version.


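Bounding the Probability of $\mathcal{F}$. Summing up, we obtain that the total
probability of $\mathcal{F}$ is at most

$$\Pr[\mathcal{F}] \le \Pr[\mathcal{F}_{test}] + \Pr[\mathcal{F}_{equal}] \le 2m(m^2 + 3m + 1)\Phi + 4m(m + 1)\Gamma,$$

where $\Phi$ (the maximum over pairs $P_i, P_j$) and $\Gamma$ (the maximum over single
programs $P_k$) are defined as above.

3.3 Bounding the Success Probability

Since all computations of $\mathcal{A}$ are independent of the challenge value $x$ in the
simulation game, any algorithm has only the trivial success probability when
interacting with the simulator. Thus the success probability of any algorithm
when interacting with the original oracle is bounded by

$$\Pi(C, V) + \mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n)) = \Pr[\mathcal{S}] \le \Pr[\mathcal{S}_{sim}] + \Pr[\mathcal{F}] \le \Pi(C, V) + \Pr[\mathcal{F}],$$

which implies

$$\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n)) \le \Pr[\mathcal{F}].$$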


3.4 The Factoring Algorithm 

Consider a factoring algorithm $\mathcal{B}$ running $\mathcal{A}$, recording the sequence of queries
$\mathcal{A}$ issues, and proceeding as follows.

- Whenever the algorithm submits $(i, j, \circ)$ with $\circ \in \{+, -, \cdot, /\}$ in its $r$-th
query, the algorithm samples $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ and computes $\gcd(P_k(y), n)$ for
$0 \le k \le r$.

- Whenever the algorithm submits $(i, j, \circ)$ with $\circ \in \{=\}$ in its $r$-th query,
the algorithm samples $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ and computes $\gcd(P_i(y) - P_j(y), n)$ for
$-1 \le i < j \le r$.
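The two gcd checks above can be sketched as follows. This is a simplified illustration rather than the algorithm $\mathcal{B}$ itself: it assumes the recorded sequence `steps` and a list of samples from $\mathcal{U}[C]$ are already available, and it tests every sample against all recorded prefixes instead of following the per-query schedule. The names `slp_values` and `try_factor` are hypothetical.

```python
from math import gcd


def slp_values(steps, y, n):
    """Return [P_{-1}(y), P_0(y), P_1(y), ...], stopping before a failed division."""
    vals = [1 % n, y % n]                  # vals[k + 1] holds P_k(y)
    for i, j, op in steps:
        a, b = vals[i + 1], vals[j + 1]
        if op == "/":
            if gcd(b, n) != 1:
                break                      # P_k(y) would be the error symbol
            c = a * pow(b, -1, n)
        elif op == "+":
            c = a + b
        elif op == "-":
            c = a - b
        elif op == "*":
            c = a * b
        else:
            raise ValueError(f"unknown operation {op!r}")
        vals.append(c % n)
    return vals


def try_factor(steps, samples, n):
    """Look for a non-trivial factor of n via gcd(P_k(y), n) and gcd(P_i(y) - P_j(y), n)."""
    for y in samples:
        vals = slp_values(steps, y, n)
        for v in vals[1:]:                 # k = 0, 1, ...
            g = gcd(v, n)
            if 1 < g < n:
                return g
        for a in range(len(vals)):         # all pairs -1 <= i < j
            for b in range(a + 1, len(vals)):
                g = gcd(vals[a] - vals[b], n)
                if 1 < g < n:
                    return g
    return None


print(try_factor([(0, -1, "-")], [4], 15))   # P_1(y) = y - 1
```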


Running time. By assumption, $\mathcal{A}$ submits $m$ queries. Thus, the algorithm evaluates
$O(m^2)$ straight line programs. Each query can be evaluated by performing
at most $m$ steps, which yields $O(m^3)$ operations in $\mathbb{Z}_n$. Moreover, the algorithm
samples $m$ random values $y$ from $\mathcal{U}[C]$ and performs $m$ gcd-computations on
$\lceil \log_2 n \rceil$-bit numbers.


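Success probability. $\mathcal{B}$ evaluates any straight line program $P_k$ with a uniformly
random element $y$ of $\mathcal{U}[C]$. In particular, $\mathcal{B}$ computes $\gcd(P_k(y), n)$ for
$y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ and the straight line program $P_k \sqsubseteq P$ satisfying

$$\Pr[P_k(x) \notin \mathbb{Z}_n^* \text{ and } P_k(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] = \max_{0 \le k \le m} \left\{ \Pr[P_k(x) \notin \mathbb{Z}_n^* \text{ and } P_k(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] \right\}.$$

Let $\gamma_1 := \max_{0 \le k \le m} \{\Pr[P_k(x) \notin \mathbb{Z}_n^* \text{ and } P_k(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C]\}$, then
by Lemma 3 algorithm $\mathcal{B}$ finds a factor in this step with probability at least
$\gamma_1 \left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2$.

Moreover, $\mathcal{B}$ evaluates any pair $P_i, P_j$ of straight line programs in $P$ with a
uniformly random element $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$. So in particular $\mathcal{B}$ evaluates $\gcd(P_i(y) -
P_j(y), n)$ with $y \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ for the pair of straight line programs $P_i, P_j \sqsubseteq P$
satisfying

$$\Pr[P_i(x) \equiv_n P_j(x) \text{ and } P_i(x') \not\equiv_n P_j(x') \mid x, x' \xleftarrow{\mathcal{U}} C] = \max_{-1 \le i < j \le m} \left\{ \Pr[P_i(x) \equiv_n P_j(x) \text{ and } P_i(x') \not\equiv_n P_j(x') \mid x, x' \xleftarrow{\mathcal{U}} C] \right\}.$$

Let $\gamma_2 := \max_{-1 \le i < j \le m} \{\Pr[P_i(x) \equiv_n P_j(x) \text{ and } P_i(x') \not\equiv_n P_j(x') \mid x, x' \xleftarrow{\mathcal{U}} C]\}$,
then by Lemma 4 algorithm $\mathcal{B}$ succeeds in this step with probability at least
$\gamma_2 \left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2$. So, for $\gamma := \max\{\gamma_1, \gamma_2\}$, the total success probability of algorithm
$\mathcal{B}$ is at least $\gamma \left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2$.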



Relating the success probability of $\mathcal{B}$ to the advantage of $\mathcal{A}$. Using the
above definitions of $\gamma_1$, $\gamma_2$, and $\gamma$, the fact that $\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n)) \le \Pr[\mathcal{F}]$,
and the derived bound on $\Pr[\mathcal{F}]$, we can obtain a lower bound on $\gamma$ by

$$\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n)) \le \Pr[\mathcal{F}] \le 4m(m + 1)\gamma_1 + 2m(m^2 + 3m + 1)\gamma_2 \le 2m(m^2 + 5m + 3)\gamma,$$

which implies the inequality

$$\gamma \ge \frac{\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))}{2m(m^2 + 5m + 3)}.$$

Therefore the success probability of $\mathcal{B}$ is at least

$$\gamma \left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2 \ge \frac{\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))}{2m(m^2 + 5m + 3)} \cdot \left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2.$$
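The last step of the chain bounding $\Pr[\mathcal{F}]$ is plain arithmetic once both $\gamma_1$ and $\gamma_2$ are replaced by their maximum $\gamma$; spelled out as a worked calculation:

```latex
\begin{align*}
4m(m+1)\gamma_1 + 2m(m^2+3m+1)\gamma_2
  &\le \bigl(4m(m+1) + 2m(m^2+3m+1)\bigr)\gamma \\
  &= \bigl(4m^2 + 4m + 2m^3 + 6m^2 + 2m\bigr)\gamma \\
  &= \bigl(2m^3 + 10m^2 + 6m\bigr)\gamma \\
  &= 2m(m^2+5m+3)\gamma .
\end{align*}
```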


4 Computing the Jacobi Symbol with Generic Ring 
Algorithms 

Let us denote with $QR_n \subseteq \mathbb{Z}_n^*$ the set of quadratic residues modulo $n$, i.e.

$$QR_n := \{x \in \mathbb{Z}_n^* \mid x \equiv y^2 \bmod n,\ y \in \mathbb{Z}_n^*\}.$$

Let $(x \mid n)$ denote the Jacobi symbol [23, p.287] and let $J_n := \{x \in \mathbb{Z}_n \mid (x \mid n) =
1\}$ be the set of elements of $\mathbb{Z}_n$ having Jacobi symbol 1. Recall that $QR_n \subseteq J_n$,
and therefore given $x \in \mathbb{Z}_n \setminus J_n$ it is easy to decide that $x$ is not a quadratic
residue by computing the Jacobi symbol.

There exist simple efficient algorithms computing the Jacobi symbol in $\mathbb{Z}_n$
without factoring $n$. These algorithms are not generic, cf. [23, p.288].
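As an illustration of such a non-generic algorithm, the sketch below uses the standard binary method based on quadratic reciprocity. It operates on the integer representation of the ring elements (parity tests, reduction of one integer modulo another), which is exactly what a generic ring algorithm cannot do. The function name `jacobi` is an assumption; the method itself is the textbook one and is not taken from this paper.

```python
def jacobi(a, n):
    """Jacobi symbol (a | n) for odd n > 0, computed without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:              # pull out factors of 2 using (2 | n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                    # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


# Elements of Z_15^* with Jacobi symbol 1.
print([x for x in range(1, 15) if jacobi(x, 15) == 1])
```

For $n = 15$ the quadratic residues are $\{1, 4\}$, a proper subset of the four elements with Jacobi symbol 1.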

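Theorem 2. Suppose there exists a generic ring algorithm $\mathcal{A}$ solving the subset
membership problem given by $(C, V)$ with $C = \mathbb{Z}_n^*$ and $V = J_n$ with advantage
$\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))$ by performing $m$ ring operations. Then there exists an algorithm
$\mathcal{B}$ finding a factor of $n$ with probability at least

$$\frac{\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))}{2m(m^2 + 5m + 3)}$$

by running $\mathcal{A}$ once and performing $O(m^3)$ additional operations in $\mathbb{Z}_n$, $m$ gcd-computations
on $\lceil \log_2 n \rceil$-bit numbers, and sampling $m$ random elements from
$\mathbb{Z}_n^*$.

Proof. The theorem follows by applying Theorem 1 and the fact that $\mathcal{U}[\mathbb{Z}_n^*] =
\mathbb{Z}_n^*$, since $\left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2 = \left(\frac{|\mathbb{Z}_n^*|}{|\mathbb{Z}_n^*|}\right)^2 = 1$.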
5 The Generic Quadratic Residuosity Problem and
Factoring

Definition 4 (Quadratic Residuosity Problem). The quadratic residuosity
problem is the subset membership problem given by $C = J_n$ and $V = QR_n$.

Given the factorization of $n$, solving the quadratic residuosity problem in $\mathbb{Z}_n$ is easy,
also for generic ring algorithms. Thus, in order to show the equivalence of generic
quadratic residuosity and factoring, we have to prove the following theorem.

Theorem 3. Suppose there exists a generic ring algorithm $\mathcal{A}$ that solves the
quadratic residuosity problem in $\mathbb{Z}_n$ with advantage $\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))$ by performing
$m$ ring operations. Then there exists an algorithm $\mathcal{B}$ finding a factor of
$n$ with probability at least

$$\frac{\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{smp}}(n))}{8m(m^2 + 5m + 3)}$$

by running $\mathcal{A}$ once and performing $O(m^3)$ additional operations in $\mathbb{Z}_n$, $m$ gcd-computations
on $\lceil \log_2 n \rceil$-bit numbers, and sampling $m$ random elements from $\mathbb{Z}_n^*$.

Proof. The cardinality $|J_n|$ of the set of elements having Jacobi symbol 1 depends
on whether $n$ is a square in $\mathbb{N}$:

$$|J_n| = \begin{cases} \varphi(n)/2, & \text{if } n \text{ is not a square in } \mathbb{N}, \\ \varphi(n), & \text{if } n \text{ is a square in } \mathbb{N}, \end{cases}$$

where $\varphi(\cdot)$ is the Euler totient function [23, p.24]. Note also that $\mathcal{U}[J_n] = \mathcal{U}[C] =
\mathbb{Z}_n^*$. Therefore it holds that $|J_n| = |C| \ge \varphi(n)/2$ and $|\mathcal{U}[C]| = |\mathbb{Z}_n^*| = \varphi(n)$. Thus
we can apply Theorem 1 using that

$$\left(\frac{|C|}{|\mathcal{U}[C]|}\right)^2 \ge \left(\frac{\varphi(n)/2}{\varphi(n)}\right)^2 = \frac{1}{4}.$$
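The counting argument in this proof can be checked by brute force for a toy modulus. The sketch below is illustrative only: it assumes a squarefree $n = 15$ with known odd prime factors, computes the Jacobi symbol as a product of Legendre symbols, and uses hypothetical helper names (`legendre`, `jacobi_from_factors`).

```python
from fractions import Fraction
from math import gcd


def legendre(a, p):
    """Legendre symbol (a | p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def jacobi_from_factors(a, primes):
    """Jacobi symbol of a modulo the product of distinct odd primes."""
    result = 1
    for p in primes:
        result *= legendre(a, p)
    return result


primes = (3, 5)
n = 15
units = [a for a in range(1, n) if gcd(a, n) == 1]
J = [a for a in units if jacobi_from_factors(a, primes) == 1]
QR = sorted({a * a % n for a in units})

# Uniform closure of J_n: all y whose residue modulo each prime occurs in J_n.
closure = [y for y in range(n) if all(y % p in {a % p for a in J} for p in primes)]

ratio_squared = Fraction(len(J), len(closure)) ** 2
print(J, QR, closure, ratio_squared)
```

In this instance $|J_n| = \varphi(n)/2 = 4$, the closure is all of $\mathbb{Z}_{15}^*$, and the squared ratio is exactly $1/4$, which accounts for the factor 8 in Theorem 3 compared with the factor 2 in Theorem 1.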



6 The Generic Subgroup Decision Problem and Factoring 

Let $n = pq$ and let $G$ be a cyclic group of order $n$. Then there exists a subgroup
$G_p \subseteq G$ of order $p$.

Definition 5 (Subgroup Decision Problem). The subgroup decision problem
is the subset membership problem $(C, V)$ with $C = G$ and $V = G_p$.

Recall that any cyclic group of order $n$ is isomorphic to the additive group of
integers $(\mathbb{Z}_n, +)$. Now, since we are going to consider generic algorithms, we may
assume that the algorithm operates on the group $G = (\mathbb{Z}_n, +)$, of course without
exploiting any property of this representation⁴. Assuming an oracle DH solving
the Diffie-Hellman problem in $G$, we observe that this operation corresponds
to the multiplication in $\mathbb{Z}_n$. Hence, the group $G$ together with oracle DH exhibits
the same algebraic structure as the ring $\mathbb{Z}_n$.

⁴ One may equivalently assume that the generic group oracle uses the group $(\mathbb{Z}_n, +)$
for the internal representation of group elements.

By the Chinese Remainder Theorem, the ring $\mathbb{Z}_n$ is isomorphic to the direct
product $\mathbb{Z}_p \times \mathbb{Z}_q$. Let $\phi : \mathbb{Z}_p \times \mathbb{Z}_q \to \mathbb{Z}_n$ denote this isomorphism. The subgroup
$G_p$ of $G$ with order $p$ consists of the elements $G_p = \{\phi(x_p, 0) \mid x_p \in \mathbb{Z}_p\}$. So for
generic ring algorithms the subgroup decision problem can be stated as: given
$x \in \mathbb{Z}_n$, decide whether $x \equiv 0 \bmod q$.

In order to model the generic subgroup decision problem, consider an oracle
$\mathcal{O}_{sdp}$ which is defined exactly like the generic ring oracle described in Section 2.4,
except that it does not provide the operation /. $\mathcal{O}_{sdp}$ receives an element $x \in \mathbb{Z}_n$
as input, where $x$ is constructed as follows: sample $(x_p, x_q) \xleftarrow{\mathcal{U}} \mathbb{Z}_p \times \mathbb{Z}_q$ and a bit
$b \xleftarrow{\mathcal{U}} \{0, 1\}$ uniformly at random, and let $x := \phi(x_p, b x_q)$. An algorithm can query
the oracle for the (inverse) group operation by submitting a query $(i, j, \circ)$ with
$\circ \in \{+, -\}$. The Diffie-Hellman oracle is queried by submitting $(i, j, \circ)$ with
$\circ \in \{\cdot\}$.
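The construction of the challenge can be sketched as follows. This is a toy illustration: the names `crt_pair` and `sample_challenge`, the small primes, and the use of Python's `random` module in place of a proper source of randomness are assumptions made for the example.

```python
import random


def crt_pair(xp, xq, p, q):
    """phi(xp, xq): the unique x in Z_{pq} with x = xp mod p and x = xq mod q."""
    return (xp * q * pow(q, -1, p) + xq * p * pow(p, -1, q)) % (p * q)


def sample_challenge(p, q, rng):
    """Sample (x_p, x_q) and a bit b, and return x = phi(x_p, b * x_q) together with b."""
    xp, xq = rng.randrange(p), rng.randrange(q)
    b = rng.randrange(2)
    return crt_pair(xp, b * xq, p, q), b


rng = random.Random(1)
x, b = sample_challenge(7, 11, rng)
print(x, b, x % 11 == 0)
```

Whenever $b = 0$ the challenge lies in $G_p$, i.e. it is divisible by $q$; for $b = 1$ this happens only when $x_q = 0$ was sampled.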

We say that the algorithm wins the game, if $x \in G_p$ and $\mathcal{A}^{\mathcal{O}_{sdp}}(n) = 1$, or
$x \notin G_p$ and $\mathcal{A}^{\mathcal{O}_{sdp}}(n) = 0$. We define the advantage of an algorithm $\mathcal{A}$ solving
the subgroup decision problem with probability $\Pr[\mathcal{S}]$ as

Adv(A°-(n)) : = Pr[ 5 ]-Q-i)|. 

Remark 1. If we also allowed the algorithm to query the oracle for divisions (which
correspond to an “inverse Diffie-Hellman oracle” in the above setting), then there
would be a simple algorithm determining whether $x \in G_p$ by returning true iff
division by $x$ fails. Interestingly, we will show that there is no generic algorithm
making similar use of a standard Diffie-Hellman oracle, unless factoring $n$ is easy.
Therefore a further consequence of the theorem presented in the following section
is that a standard Diffie-Hellman oracle does not imply an inverse Diffie-Hellman
oracle in general, unless factoring is easy.

Remark 2. The subgroup decision problem was introduced in jS] for groups with 
bilinear pairing. Essentially such a pairing can be added to the generic model by 
allowing the algorithm to perform a single multiplication operation when eval- 
uating the bilinear pairing map0 as done in £1] . By providing a Diffie-Hellman 
oracle, we do not restrict the algorithm to a fixed number of multiplications. 
Hence, our proof includes the problem stated in jSj as a special case. 


6.1 Generic Equivalence to Factoring 

In the sequel we show that solving the subgroup decision problem in groups of 
order n is as hard as factoring n, even if the algorithm has access to an oracle 
solving the Diffie-Hellman problem. 


⁵ Plus some minor technical details to distinguish between different groups.
Theorem 4. Suppose there exists a generic ring algorithm $\mathcal{A}$ solving the subgroup
membership problem in $G$ with advantage $\mathrm{Adv}(\mathcal{A}^{\mathcal{O}_{sdp}}(n))$ by making $m$
queries to an oracle performing the (inverse) group operation and solving the
Diffie-Hellman problem. Then there exists an algorithm $\mathcal{B}$ finding a factor of $n$
with probability at least $\mathrm{Adv}(\mathcal{A}^{\mathcal{O}_{sdp}}(n))$ by running $\mathcal{A}$ once and performing $O(m^3)$
additional operations in $\mathbb{Z}_n$ and $m$ gcd-computations on $\lceil \log_2 n \rceil$-bit numbers.

Proof. Let us consider an interaction of $\mathcal{A}$ with an oracle $\mathcal{O}_p$ which is defined as
follows. $\mathcal{O}_p$ works similarly to $\mathcal{O}_{sdp}$, but performs all computations in $\mathbb{Z}_p$. That is,
the equal()-procedure returns true on input $(i, j)$ iff $P_i(x) \equiv P_j(x) \bmod p$. Note
that now all computations are performed in the $\mathbb{Z}_p$-component of the decomposition
$\mathbb{Z}_p \times \mathbb{Z}_q$ of $\mathbb{Z}_n$, hence the algorithm receives no information on whether
$x \equiv 0 \bmod q$. Thus in the simulation game any algorithm has only trivial success
probability $\Pr[\mathcal{S}_{sim}] = 1/2 + 1/q$.

Now consider an interaction of $\mathcal{A}$ with oracle $\mathcal{O}_{sdp}$. Either this interaction
is indistinguishable from an interaction with $\mathcal{O}_p$, in which case the algorithm has only
trivial success probability, or there exist $P_i, P_j \sqsubseteq P$ such that $P_i(x) \equiv
P_j(x) \bmod p$, but $P_i(x) \not\equiv P_j(x) \bmod n$. In this case a factor of $n$ is found by
computing $\gcd(P_i(x) - P_j(x), n)$. Note that

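$$\tfrac{1}{2} + \mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{sdp}}(n)) \le \Pr[\mathcal{S}_{sim}] + \Pr[\mathcal{F}]$$

$$\iff \mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{sdp}}(n)) \le \Pr[\mathcal{F}].$$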

Thus, $n$ is factored this way by running $\mathcal{A}$, recording $P$ and computing

$$\gcd(P_i(x) - P_j(x), n)$$

for all $-1 \le i < j \le m$ with probability at least $\mathrm{Adv}_{(C,V)}(\mathcal{A}^{\mathcal{O}_{sdp}}(n))$.
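A toy instance shows how such a pair of programs reveals a factor. The sketch assumes $n = 77 = 7 \cdot 11$, the challenge $x = 44$ (which is $2 \bmod 7$ and $0 \bmod 11$), and a recorded program computing $x^2$ and $x^3$; the helper names `recorded_values` and `factor_from_collision` are hypothetical.

```python
from math import gcd


def recorded_values(steps, x, n):
    """Return [P_{-1}(x), P_0(x), P_1(x), ...] for a division-free program."""
    vals = [1 % n, x % n]                    # vals[k + 1] holds P_k(x)
    for i, j, op in steps:
        a, b = vals[i + 1], vals[j + 1]
        if op == "+":
            c = a + b
        elif op == "-":
            c = a - b
        elif op == "*":
            c = a * b
        else:
            raise ValueError("only +, - and * are offered in this setting")
        vals.append(c % n)
    return vals


def factor_from_collision(steps, x, n):
    """Compute gcd(P_i(x) - P_j(x), n) for all pairs i < j; return a non-trivial factor."""
    vals = recorded_values(steps, x, n)
    for a in range(len(vals)):
        for b in range(a + 1, len(vals)):
            g = gcd(vals[a] - vals[b], n)
            if 1 < g < n:
                return g
    return None


# P_1 = x * x and P_2 = P_1 * x; since 2^3 = 8 = 1 mod 7, P_2(x) equals P_{-1}(x) = 1
# modulo 7 but not modulo 77.
steps = [(0, 0, "*"), (1, 0, "*")]
print(factor_from_collision(steps, 44, 77))
```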

The above proof generalizes from $n = pq$ to $n = \prod_{i=1}^{k} p_i^{e_i}$ for subgroups with
prime-power order $p_i^{e_i}$ in a straightforward manner.

7 Analyzing Search Problems in the Generic Ring Model 

In Section 3 we have constructed a simulator for a generic ring oracle for the ring
Z n . When interacting with the simulator, all computations are independent of 
the secret challenge value x. Therefore we have been able to conclude that any 
generic algorithm has only the trivial probability of success in solving certain 
decisional problems (namely the considered subset membership problems) when 
interacting with the simulator. Moreover, we have shown that any algorithm 
that can distinguish between simulator and original oracle can be turned into a 
factoring algorithm with (asymptotically) the same running time. 

In contrast to decisional problems, where the algorithm outputs a bit, our 
construction of the simulator can also be applied to prove the generic hardness 
of search problems where the algorithm outputs a ring element or integer. Let 
us sketch two possibilities. The first one is to formulate a suitable subset membership
problem which reduces to the considered search problem and then apply
Theorem 1. Another possibility is to use our construction of the simulator to
bound the probability of a simulation failure relative to factoring. In order to 
bound the success probability in the simulation game, it remains to show that 
there exists no straight line program solving the considered problem efficiently 
under the factoring assumption. 
A Proof Sketch for Lemma 3

For notational convenience, let us define $\Gamma(P) := \Pr[P(x') \notin \mathbb{Z}_n^* \text{ and } P(x) \in
\mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C]$ and $\Lambda(P) := \Pr[\gcd(n, P(y)) \notin \{1, n\} \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]]$. Thus, in
order to prove Lemma 3 we have to show that the inequality

$$\Gamma(P) \le \left(\frac{|\mathcal{U}[C]|}{|C|}\right)^2 \Lambda(P) \qquad (1)$$

holds. To this end, we will define an auxiliary function $\nu_i(P)$. Then we express
$\Gamma(P)$ and $\Lambda(P)$ in terms of $\nu_i(P)$. More precisely, we will upper bound $\Gamma(P)$ by
an expression in $\nu_i(P)$ and lower bound $\Lambda(P)$ by an expression in $\nu_i(P)$. The
resulting inequality is proven easily by complete induction.
Defining an auxiliary function. Recall that we denote with $n = \prod_{i=1}^{k} p_i^{e_i}$
the prime factor decomposition of $n$. Let

$$\nu_i(P) := \Pr[P(x) \equiv 0 \bmod p_i \mid x \xleftarrow{\mathcal{U}} \mathcal{U}[C]]$$

be the probability that $P(x) \equiv 0 \bmod p_i$ for some prime $p_i$ dividing $n$ and
$x \xleftarrow{\mathcal{U}} \mathcal{U}[C]$. Recall that $\phi : \mathbb{Z}_{p_1^{e_1}} \times \cdots \times \mathbb{Z}_{p_k^{e_k}} \to \mathbb{Z}_n$ is a ring isomorphism, and $P$ performs
only ring operations in $\mathbb{Z}_n$. Therefore $P$ implicitly performs all operations on each
component separately (and independently). Moreover, sampling $x \xleftarrow{\mathcal{U}} \mathcal{U}[C]$
is equivalent to sampling $\phi(x_1, \ldots, x_k)$ with $x_i$ chosen independently and uniformly
from $C_i$ for $1 \le i \le k$ (cf. Lemma 1). Thus we can express the probability that
$P(x) \in \mathbb{Z}_n^*$ for $x \xleftarrow{\mathcal{U}} \mathcal{U}[C]$ as

$$\Pr[P(x) \in \mathbb{Z}_n^* \mid x \xleftarrow{\mathcal{U}} \mathcal{U}[C]] = \prod_{i=1}^{k} (1 - \nu_i(P)).$$


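Bounding $\Gamma(P)$ in terms of $\nu_i(P)$. For independently sampled $x, x'$, we
have

$$\Gamma(P) = \Pr[P(x') \notin \mathbb{Z}_n^* \text{ and } P(x) \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] = \Pr[P(x) \notin \mathbb{Z}_n^* \mid x \xleftarrow{\mathcal{U}} C] \cdot \Pr[P(x) \in \mathbb{Z}_n^* \mid x \xleftarrow{\mathcal{U}} C].$$

Note that, since $C \subseteq \mathcal{U}[C]$, it holds that

$$\Pr[P(x) \in \mathbb{Z}_n^* \mid x \xleftarrow{\mathcal{U}} C] \le \Pr[P(y) \in \mathbb{Z}_n^* \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]] \cdot \frac{|\mathcal{U}[C]|}{|C|}$$

and similarly

$$\Pr[P(x) \notin \mathbb{Z}_n^* \mid x \xleftarrow{\mathcal{U}} C] \le \left(1 - \Pr[P(y) \in \mathbb{Z}_n^* \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]]\right) \cdot \frac{|\mathcal{U}[C]|}{|C|}.$$

Therefore we can conclude that

$$\Gamma(P) \le \Pr[P(y) \in \mathbb{Z}_n^* \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]] \left(1 - \Pr[P(y) \in \mathbb{Z}_n^* \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]]\right) \left(\frac{|\mathcal{U}[C]|}{|C|}\right)^2
= \prod_{i=1}^{k} (1 - \nu_i(P)) \left(1 - \prod_{i=1}^{k} (1 - \nu_i(P))\right) \left(\frac{|\mathcal{U}[C]|}{|C|}\right)^2. \qquad (2)$$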

Bounding $\Lambda(P)$ in terms of $\nu_i(P)$. We can find a factor of $n$ by computing
$\gcd(n, P(y))$, if $P(y) \equiv 0 \bmod p_i$ for at least one prime $p_i$ dividing $n$, and $P(y) \not\equiv
0 \bmod n$. Using similar arguments as above, we can therefore express $\Lambda(P)$ in
terms of $\nu_i(P)$ as




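$$\Lambda(P) = \Pr[\gcd(n, P(y)) \notin \{1, n\} \mid y \xleftarrow{\mathcal{U}} \mathcal{U}[C]] \ge 1 - \prod_{i=1}^{k} (1 - \nu_i(P)) - \prod_{i=1}^{k} \nu_i(P). \qquad (3)$$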


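Putting things together. Combining (2) and (3), we see that (1) holds if

$$\prod_{i=1}^{k} (1 - \nu_i(P)) \left(1 - \prod_{i=1}^{k} (1 - \nu_i(P))\right) \le 1 - \prod_{i=1}^{k} (1 - \nu_i(P)) - \prod_{i=1}^{k} \nu_i(P)$$

holds, which is shown easily by complete induction on $k \ge 2$.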

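B Proof Sketch for Proposition 1

If there exists $P_j$ such that ($P_j(x) = \bot$ and $P_j(x_r) \neq \bot$), then this implies that
there exists $P_k \sqsubseteq P$ with $k < j$ such that ($P_k(x_r) \in \mathbb{Z}_n^*$ and $P_k(x) \notin \mathbb{Z}_n^*$)
by Lemma 2. Hence, in order to bound the probability of $\mathcal{F}_{test}$, it suffices to
consider the probability that there exists a straight line program $P_j \sqsubseteq P$ such
that

$$(P_j(x_r) \notin \mathbb{Z}_n^* \text{ and } P_j(x) \in \mathbb{Z}_n^*) \text{ or } (P_j(x) \notin \mathbb{Z}_n^* \text{ and } P_j(x_r) \in \mathbb{Z}_n^*) \qquad (4)$$

for $x, x_1, \ldots, x_m \xleftarrow{\mathcal{U}} C$.

By (essentially) applying the union bound we can see that for fixed $P_j$ this
probability is bounded by

$$2m \Pr[P_j(x) \notin \mathbb{Z}_n^* \text{ and } P_j(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C].$$

Using this, we obtain the following bound on the probability that there exists
any $P_j \sqsubseteq P$ satisfying (4).

$$\Pr[\mathcal{F}_{test}] \le 2m \sum_{j=0}^{m} \Pr[P_j(x) \notin \mathbb{Z}_n^* \text{ and } P_j(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C]
\le 2m(m + 1) \max_{0 \le j \le m} \left\{ \Pr[P_j(x) \notin \mathbb{Z}_n^* \text{ and } P_j(x') \in \mathbb{Z}_n^* \mid x, x' \xleftarrow{\mathcal{U}} C] \right\}$$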
Abstract. We revisit previous formulations of zero knowledge in the random
oracle model due to Bellare and Rogaway (CCS ’93) and Pass (Crypto ’03), and 
present a hierarchy for zero knowledge that includes both of these formulations. 
The hierarchy relates to the programmability of the random oracle, previously 
studied by Nielsen (Crypto ’02). 

- We establish a subtle separation between the Bellare-Rogaway formulation 
and a weaker formulation, which yields a finer distinction than the separation 
in Nielsen’s work. 

- We show that zero-knowledge according to each of these formulations is not 
preserved under sequential composition. We introduce stronger definitions 
wherein the adversary may receive auxiliary input that depends on the 
random oracle (as in Unruh (Crypto ’07)) and establish closure under 
sequential composition for these definitions. We also present round-optimal 
protocols for NP satisfying the stronger requirements. 

- Motivated by our study of zero knowledge, we introduce a new definition of 
proof of knowledge in the random oracle model that accounts for oracle- 
dependent auxiliary input. We show that two rounds of interaction are 
necessary and sufficient to achieve zero-knowledge proofs of knowledge 
according to this new definition, whereas one round of interaction is 
sufficient in previous definitions. 

- Extending our work on zero knowledge, we present a hierarchy for circuit 
obfuscation in the random oracle model, the weakest being that achieved in 
the work of Lynn, Prabhakaran and Sahai (Eurocrypt ’04). We show that the 
stronger notions capture precisely the class of circuits that is efficiently and 
exactly learnable under membership queries.

Keywords: zero-knowledge, random oracle model, sequential composition, 
obfuscation. 


1 Introduction 

The random oracle (RO) model, introduced by Fiat and Shamir and refined by
Bellare and Rogaway, was proposed as a framework for designing and analyzing
cryptographic schemes that offers a trade-off between provable security and practical 
efficiency. In this model, every party has oracle access to a truly random function. 
With this additional functionality, many cryptographic problems admit more efficient 

* Work done while visiting Tsinghua University, Beijing, China. 
solutions than in the standard model, along with considerably simpler proofs of
security. In practice, the idealized random function is instantiated using a
“good” cryptographic hash function, like SHA-1 or a variation thereof. There are also
cryptographic problems for which we have partial solutions in the random oracle model
but not in the standard model, most notably that of circuit obfuscation. In
both cases, proofs in the random oracle model do not guarantee security or feasibility in
the standard model (and in fact, there has been substantial evidence to the contrary);
nonetheless, the model provides a useful idealized test-bed for analyzing
cryptographic schemes.

As a first step towards establishing security, it is necessary to define security in 
the random oracle model. A naive extension of a definition in the standard model 
may affect the semantics of the underlying notion of security. Consider the case of 
zero-knowledge proofs, namely proofs that yield no knowledge beyond the validity of 
the assertion proved. Formally, an interactive protocol is zero-knowledge if there
exists a simulator that can simulate the behavior of every, possibly cheating, verifier 
without access to the prover, such that its output is indistinguishable from the output 
of the verifier after having interacted with the honest prover. In the standard model, 
a zero-knowledge proof is necessarily deniable, in that the protocol’s transcript does 
not constitute any evidence of the interaction, since any party could have generated the 
transcript by himself. However, the Bellare-Rogaway formulation of zero-knowledge 
in the random oracle model does not imply deniability, since the simulator can choose 
the random oracle G3ZH. In particular, the formulation allows for (non-trivial) one-round 
zero-knowledge proof systems, and the transcript of such a protocol constitutes 
evidence of participating in the protocol, contradicting deniability. 

In this work, we revisit two aspects of formulating zero-knowledge in the random 
oracle model. The first relates to defining security in the random oracle model and 
in particular, what it means to choose the random oracle, an issue first addressed by 
Nielsen lETll . The second relates to a different aspect of zero-knowledge proofs, namely, 
we want the zero-knowledge guarantee to hold even if the verifier may have some 
additional a priori information about the input. The need to account for such auxiliary 
input, which arises in typical applications such as sequential repetitions of a protocol, 
was articulated in the work of Goldreich and Oren mu and again in that of Unruh 
[25]. While the Bellare-Rogaway formulation of zero-knowledge does take into account 
auxiliary input, it does not allow for dependencies between the auxiliary input and the 
random oracle, which arise for instance, when the auxiliary input is a transcript of a 
previous interaction using the oracle. 

1.1 Programmability in the Random Oracle Model 

There are two reasons why, in the simulation-based paradigm, it is easier to achieve 
security in the random oracle model: 

- the simulator can see the queries parties make to the random oracle; 

- the simulator can choose the answers to these queries. 

The second is what we refer to as programming the random oracle, and may be qualified 
in several different ways. Suppose our goal is to simulate a transcript RO(s), namely 
the evaluation of the random oracle RO at some value s. Our intuition about the 
random oracle as a truly random function indicates that picking a truly random string r 
should suffice (essentially choosing the evaluation of RO at s to be r), and indeed, no 
distinguisher - even computationally unbounded ones - can distinguish a truly random 
string from RO(s), provided the distinguisher does not get access to RO. On the other 
hand, if we give the distinguisher access to RO, then the only “good” simulation of 
the transcript is RO(s), and the simulation must query RO at s. This is because the 
distinguisher may have s hardwired into it, query RO at s, and check whether 
the answer matches the transcript. In this setting, the simulator does not get to choose 
the answers to oracle queries. To distinguish between these two notions of security, 
we will refer to the former as the fully programmable random oracle (FPRO) model, 
and the latter as the non-programmable random oracle (NPRO) model (as coined by 
Nielsen EU). 
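
The contrast between the two settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `RandomOracle`, the function `distinguisher`, and the 16-byte output length are hypothetical names and parameters chosen for the sketch, and the oracle is modelled by lazy sampling.

```python
import os

class RandomOracle:
    """Lazily sampled random function: a fresh random answer per new query."""
    def __init__(self, out_len=16):
        self.out_len = out_len
        self.table = {}

    def query(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        return self.table[x]

def distinguisher(s, transcript, oracle=None):
    """Distinguisher with s hardwired.

    Without oracle access it can only check the format of the transcript;
    with oracle access it re-queries RO at s and compares.
    """
    if oracle is None:
        return len(transcript) == 16
    return transcript == oracle.query(s)

ro = RandomOracle()
s = b"hardwired value s"

real = ro.query(s)             # the real transcript RO(s)
random_guess = os.urandom(16)  # simulation that never touches RO
queried = ro.query(s)          # simulation that queries RO at s
```

Without oracle access, `real` and `random_guess` are both uniformly random strings, so the distinguisher accepts either. Once the distinguisher is handed `ro`, only the simulation that actually queried RO at s survives the check (the random guess matches with probability 2^-128 in this sketch).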

In the case where we allow the simulator to choose the answers to oracle queries, 
we may still impose an additional requirement, namely that the simulator must output 
its choices of these query/answer pairs. In the above example, where the simulator 
chooses the output of RO at s to be some random string r, its output will include the 
transcript r, along with the list (s, r), corresponding to the query s and answer r. This 
is in fact the notion of programmability raised by Bellare and Rogaway 0 for zero- 
knowledge, and we will refer to this as the explicitly programmable random oracle 
(EPRO) model. We defer a precise definition to the body of the paper, but note at this 
point that security in the non-programmable random oracle model (strongest security 
guarantee) implies security in the explicitly programmable random oracle model, which 
in turn implies security in the fully programmable random oracle model (weakest 
security guarantee). 


1.2 Our Contributions and Techniques 

Hierarchy for zero knowledge. We begin with a simple and unified framework for 
defining zero knowledge in the three variants of the random oracle model, and then 
present a (perhaps surprising) separation for zero knowledge in the fully programmable 
and explicitly programmable random oracle models. This yields a finer separation than 
that in Nielsen’s work gd, and complements Pass’s separation for zero knowledge in 
the explicitly programmable and non-programmable random oracle models. 


Auxiliary input and sequential composition. Following the work of Goldreich et al. 
for zero-knowledge in the standard model, we use closure under sequential 
composition as a yardstick for evaluating formulations of zero-knowledge. We show 
that zero-knowledge in each of the three variants of the random oracle model is not closed 
under sequential composition.¹ This motivates a new formulation of zero-knowledge 
in the random oracle which allows for auxiliary inputs that depend on the oracle, as 


1 That this may be the case has been previously noted (e.g. E3), but to our knowledge, there 
has been no formal (published) proof. 
was done in [25] for one-way functions, encryption and other primitives.² We show that 
for efficient-prover protocols, zero-knowledge with oracle-dependent auxiliary input in 
the explicit-programmable and non-programmable random oracle models are preserved 
under a polynomial number of sequential repetitions. We also present round-optimal 
protocols for NP satisfying the new formulations of zero-knowledge. 

Proofs of knowledge. Our constructions demonstrating that previous formulations of 
zero knowledge are not closed under sequential composition implicitly rely on 
non-interactive zero-knowledge “proofs of knowledge” in the random oracle model. 
Specifically, non-interactive protocols are necessarily malleable (without unique identifiers), 
and the cheating verifier can generate a convincing proof of knowledge by copying 
one sent by the prover in a previous iteration of the protocol. This motivates a new 
formulation of proof of knowledge in the random oracle model that takes into account 
oracle-dependent auxiliary input. We show that two rounds of interaction are necessary 
and sufficient to achieve zero-knowledge proofs of knowledge according to this new 
definition. 

Circuit obfuscation. We extend our framework for programmability to circuit 
obfuscation³ in the random oracle model 111 911 1, and note that the obfuscator constructions of 
Lynn et al. [19] achieve security in the fully programmable random oracle model. Next, 
we show circuit obfuscation in the explicit-programmable random oracle model can 
only be realized for classes of circuits that are efficiently and exactly learnable under 
membership queries, and for these classes, obfuscation may be (trivially) realized in 
the plain model, so the characterization is exact. We find it surprising that we can have 
non-trivial constructions in the explicitly programmable model for zero knowledge but 
not for circuit obfuscation. 


1.3 Discussion 

Formulating zero-knowledge. A general framework for defining security in the random 
oracle model was presented by Nielsen eh, based on augmenting the universally 
composable (UC) framework 0 with a random oracle functionality. This guarantees 
composability. As pointed out by Pass [22], deniability is not guaranteed in this 
framework. Nielsen also defined security with a non-programmable random oracle, 

2 For the primitives considered in [25], the random oracle is typically only used in the proof 
of security. Specifically, Unruh [25] does not explicitly address primitives with a 
simulation-based notion of security, which is the focus of this work and where the random oracle is also 
exploited in constructing a simulator. On the other hand, Unruh considers a stronger notion 
of oracle-dependent auxiliary input, where a polynomial bound is imposed only on the output 
length of the machine generating the auxiliary input and not its query complexity. 

3 We use the term obfuscation to refer to the stronger notion of obfuscation against general 
adversaries, instead of obfuscation against predicate adversaries 111261 . In the standard model, 
only classes of circuits that are efficiently and exactly learnable under membership queries 
are obfuscatable against general adversaries ESI . The result also extends to the fully- 
programmable RO model. 
where the environment in the UC framework is also given access to the random oracle. 
This offers deniability, but may no longer guarantee universal composability. 

Instead of adopting Nielsen’s formulation, we consider a minimal framework (based 
on [13,14]) for which we can provide the weaker guarantee of sequential composition. 
The simplicity allows us to focus on how the random oracle is incorporated differently 
in each of [2,13,22]. In addition, it offers several conceptual advantages: it offers 
modularity (which allows us to decouple the zero-knowledge property from the proof- 
of-knowledge property and other properties implied by UC zero-knowledge, and for 
impossibility results, these distinctions are particularly important) and reinforces the 
theme of this work, that of understanding how semantics can change between the 
standard model and the random oracle model. Furthermore, our framework is simple 
enough to be applied to circuit obfuscation, for which we have very few non-trivial 
positive results, let alone constructions that compose arbitrarily (which probably only 
exist for trivially obfuscatable families of circuits). 

Sequential composition not the end-goal. We recall the arguments used in to 
motivate the study of auxiliary-input zero-knowledge: first, it fully captures the intuitive 
meaning of the concept of zero-knowledge; and second, this stronger requirement is 
necessary when a zero-knowledge protocol is used as a sub-protocol within larger 
cryptographic protocols.⁴ It is for these same reasons that we pursue a formulation 
of zero-knowledge in the random oracle model that incorporates auxiliary input (refer 
to [25] for additional arguments). Indeed, we regard our sequential composition lemma 
as evidence that we have properly accounted for auxiliary input in our formulation 
and not a goal in and of itself. Similarly, constructing protocols for NP that remain 
zero-knowledge under sequential composition should not be an objective in itself.⁵ 
Neither should a generic method for transforming protocols that are zero-knowledge 
into another that remain zero-knowledge under sequential composition. 

On “explicit programmability”. From previous work [13,22,19,26], we know that 
allowing the simulator to program the random oracle is necessary and sufficient for 
one-round zero-knowledge protocols for NP and obfuscating point functions in the 
random oracle model. However, while explicit programmability is sufficient for zero- 
knowledge, we show that full programmability is necessary for the latter. This means 
that the reason we are able to realize non-trivial circuit obfuscation in the random oracle 
model comes not only from programming the random oracle, but also from not having 
to specify explicitly how we program the random oracle. 

The issue of explicit programmability also arises in the study of sequential composi- 
tion. To obtain zero-knowledge that is closed under a polynomial number of sequential 

4 One may ask, why not aim for universal composability then? This is addressed in the previous 
paragraph, and as with previous work in the standard setting, we feel that zero-knowledge w.r.t. 
auxiliary input is indeed the right compromise. 

5 All “natural” zero-knowledge protocols for NP in the RO model (in the 0 sense) remain zero- 
knowledge under sequential, even concurrent, composition, but this does not obliterate the 
need for the “right” definition. After all, when auxiliary-input zero-knowledge was introduced, 
all known zero-knowledge protocols were black-box and therefore remained zero-knowledge 
under sequential composition. 
compositions, it appears that explicit programmability is necessary, in addition to 
properly accounting for auxiliary input. 

Self-composition for circuit obfuscation. Lynn et al. introduce self-composition for 
circuit obfuscation [19], a notion of composition analogous to sequential composition. 
In addition, they give an obfuscator for point functions in the random oracle model 
that is not 2-self-composing. This is because the construction is not a valid obfuscator 
w.r.t. dependent auxiliary input. To obtain polynomial self-composition for obfuscation 
using techniques in this work, we will need a definition that incorporates both oracle- 
dependent auxiliary input and explicit programmability. 


2 Preliminaries 

A negligible function is a function of the form $n^{-\omega(1)}$, and is denoted neg(n). We use 
PPT as an abbreviation for a probabilistic (strict) polynomial-time Turing machine. We 
also consider the nonuniform and oracle analogues, which we denote by nonuniform 
PPT and oracle PPT respectively. In probability expressions that involve a probabilistic 
computation, the probability is also taken over the internal coin tosses of the underlying 
computation. We refer the reader to 11721 for definitions of interactive proof systems, 
zero-knowledge, proofs of knowledge and witness-indistinguishability (WI) in the 
standard model. For a relation $R \subseteq \{0,1\}^* \times \{0,1\}^*$, the language associated with 
$R$ is $L_R = \{x : \exists y\ (x, y) \in R\}$. 


3 Zero Knowledge in the Random Oracle Model 

In this section, we present our hierarchy of formulations for zero knowledge in the RO 
model, along with those that account for oracle-dependent auxiliary input. We begin 
with several formalisms we will use in defining zero knowledge: 

- We use RO to denote the random oracle and $\epsilon$ to denote an oracle that returns the 
empty string on all inputs. 

- Given a function $f : \{0,1\}^* \to \{0,1\}^*$ and a list $\ell \subseteq \{0,1\}^* \times \{0,1\}^*$, we use 
$f[\ell]$ to denote a function that agrees with $f$ everywhere except on inputs specified 
by the set $\ell$. Specifically, 

$$f[\ell](x) = \begin{cases} y & \text{if } \exists y \text{ such that } (x, y) \in \ell \\ f(x) & \text{otherwise} \end{cases}$$

Informally, we refer to $f[\ell]$ as the function obtained by programming $f$ on the 
inputs in $\ell$. In the definition of zero-knowledge, the simulator generates a pair $(\tau, \ell)$: 
the simulator programs the random oracle on the inputs in $\ell$, and $\tau$ corresponds to 
the view of the cheating verifier while interacting with the prover using the oracle 
$RO[\ell]$. 
- We allow the auxiliary input to be generated by a nonuniform oracle PPT Z 
(the nonuniformity allows for auxiliary information that may depend on the input 
instance), which we refer to as the auxiliary input machine. We will give Z oracle 
access to either $\epsilon$ or RO, depending on whether we allow the auxiliary input to 
depend on the random oracle. 
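
The programming notation can be made concrete with a short Python sketch. It is a minimal illustration only: the helper names `program`, `ro` and `simulate_transcript`, the 16-byte answers, and the lazily sampled oracle are assumptions made for this sketch. The list is assumed to contain at most one answer per input, as in the definition above.

```python
import os

def program(f, pairs):
    """Return f[l]: agrees with f except on the inputs listed in pairs."""
    table = dict(pairs)
    def programmed(x):
        return table[x] if x in table else f(x)
    return programmed

# A lazily sampled stand-in for the random oracle RO.
_answers = {}
def ro(x: bytes) -> bytes:
    if x not in _answers:
        _answers[x] = os.urandom(16)
    return _answers[x]

def simulate_transcript(s: bytes):
    """Simulate the transcript RO(s) by choosing the answer r at s.

    The simulator outputs the pair (tau, l): the transcript tau = r
    together with the list l = [(s, r)] of points it programmed.
    """
    r = os.urandom(16)
    return r, [(s, r)]

s = b"some value s"
tau, ell = simulate_transcript(s)
ro_programmed = program(ro, ell)   # the oracle RO[l] handed to the distinguisher
```

A distinguisher that is given `ro_programmed` and re-queries it at `s` sees an answer consistent with `tau`, while every other input is still answered by the underlying oracle.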

Definition 1 (zero-knowledge [22]). Let (P, V) be an interactive protocol for a 
language $L = L_R$. Let $V^*, S, Z, D$ be oracle PPTs. Given $(x, w) \in R$, we define 
$\mathrm{diff}^{O_1,O_2,O_3}_{V^*,S,Z,D}(x, w)$ to be the quantity 

$$\begin{aligned} &\Pr[z \leftarrow Z^{O_1}(1^{|x|});\ \tau \leftarrow (P^{RO}(w), {V^*}^{RO}(z))(x) : D^{O_2}(x, \tau, z) = 1] \\ -\ &\Pr[z \leftarrow Z^{O_1}(1^{|x|});\ (\tau, \ell) \leftarrow S^{RO}(x, z) : D^{O_3}(x, \tau, z) = 1] \end{aligned}$$

We say that (P, V) is zero-knowledge in the fully programmable random oracle 
model (FPRO) if for every oracle PPT $V^*$, there exists an oracle PPT S such that for 
all $(x, w) \in R$ and for all nonuniform oracle PPTs Z and D, $\mathrm{diff}^{\epsilon,\epsilon,\epsilon}_{V^*,S,Z,D}(x, w)$ is 
negligible (as a function of $|x|$). In addition, we obtain zero-knowledge in the: 

explicitly programmable RO (EPRO) model if $O_1 = \epsilon$, $O_2 = RO$, $O_3 = RO[\ell]$ 
non-programmable RO (NPRO) model if $O_1 = \epsilon$, $O_2 = RO$, $O_3 = RO$ 
FPRO model w.r.t. dependent auxiliary input if $O_1 = RO$, $O_2 = \epsilon$, $O_3 = \epsilon$ 
EPRO model w.r.t. dependent auxiliary input if $O_1 = RO$, $O_2 = RO$, $O_3 = RO[\ell]$ 
NPRO model w.r.t. dependent auxiliary input if $O_1 = RO$, $O_2 = RO$, $O_3 = RO$ 

For simplicity, we will also refer to the respective notions of zero-knowledge as FPRO 
zero-knowledge, EPRO zero-knowledge, NPRO zero-knowledge, auxiliary-input FPRO 
zero-knowledge, auxiliary-input EPRO zero-knowledge, and auxiliary-input NPRO 
zero-knowledge. 

Zero-knowledge in the FPRO model. This definition captures the weakest requirement, 
in that the simulator may choose the random oracle in the simulated transcript, as 
long as it “looks” random. We point out that we require simulating the output of the 
cheating verifier, but not the random oracle used in the simulated transcript. This 
is equivalent to a definition wherein the simulator S is given access to RO. Since 
the distinguisher does not have access to RO, the simulator can simply generate 
a random oracle by itself, so giving the simulator access to RO does not give the 
simulator any extra power. Note that this definition also constitutes a relaxation 
of the UC framework augmented with a random oracle functionality (namely, 
that obtained by replacing the interactive environment with a non-interactive 
distinguisher) ETF1 . 

Zero-knowledge in the EPRO model. The main qualitative difference between FPRO 
zero-knowledge and EPRO zero-knowledge⁶ is that the simulator is required to 
completely specify a simulated random oracle (namely $RO[\ell]$) in the latter, which 


6 Indeed, making this distinction in the UC framework would require clumsy modifications. 
the distinguisher is given access to.⁷ We require that S specifies $\ell$ explicitly, which 
implies a polynomial bound on the size of $\ell$. On the other hand, the oracle $RO[\ell]$ is 
specified implicitly. EPRO zero-knowledge is equivalent to the Bellare-Rogaway 
formulation, except the latter does not give the simulator oracle access to RO. 
As with zero-knowledge in the FPRO model, this does not make any qualitative 
difference as the simulator can simply generate random answers to the RO queries 
and add these query-answer pairs to the list $\ell$. 

Zero-knowledge in the NPRO model. Here, the simulator is not allowed to choose the 
random oracle in the simulated transcript. This implies deniability, and is equivalent 
to Pass’s formulation [22]. It is a special case of EPRO zero-knowledge with $\ell = \emptyset$. 
For efficient-prover protocols, the NPRO zero-knowledge requirement is equivalent 
to requiring that the following quantity be negligible [22,25]: 

$$E_{RO}\Big[\,\Big|\Pr[z \leftarrow Z^{O_1}(1^{|x|});\ D^{RO}(x, (P^{RO}(w), {V^*}^{RO}(z))(x), z) = 1] - \Pr[z \leftarrow Z^{O_1}(1^{|x|});\ D^{RO}(x, S^{RO}(x, z), z) = 1]\Big|\,\Big]$$

This is also true w.r.t. dependent auxiliary input. 

Incorporating dependent auxiliary input. Incorporating dependent auxiliary input 
provides some guarantee of “independence” between the queries made to the 
random oracle in the protocol and prior queries, even though we do not know 
what the prior queries are. To achieve this definition, we construct simulators that 
program the random oracle on inputs that have not been previously queried by Z 
(here, we exploit the polynomial bound on the query complexity of Z). Unlike the 
case without auxiliary input, it is essential that we provide the simulator for zero- 
knowledge and epro zero-knowledge with oracle access to RO so that the simulator 
may generate transcripts that are consistent with the output from Z. 

Verifier’s view. A common convention in defining zero-knowledge in the standard 
model is to use $(P^{RO}(w), {V^*}^{RO}(z))(x)$ to denote the view of the verifier $V^*$, which 
consists of the protocol’s transcript and the verifier’s random tape, instead of the 
output of the verifier. This is because we may incorporate the computation of the 
output from the view into the distinguisher. This argument does not necessarily 
apply to definitions in the RO model. In this case, the distinguisher does not have 
access to RO and may not be able to compute the output from the view.⁸ Therefore, 
we reserve $(P^{RO}(w), {V^*}^{RO}(z))(x)$ to denote the output of the verifier. 

A note on black-box simulation. As with previous works on zero knowledge in the RO 
model, we will establish the zero-knowledge property via black-box simulation, 

7 For some secret value s and a random RO, we may easily simulate a view of RO(s) with a 
random string. However, in order to simulate a view of RO(s) along with an oracle that is 
consistent with this view, we will need to either query RO at s or program RO at s; either 
operation requires “knowing” s. 

8 Simply requiring that the verifier’s query/answer pairs be included in its view may not be 
sufficient as we may also need the prover’s query/answer pairs. 



except we will allow the simulator to see the oracle queries made by the cheating 
verifier. This is consistent with the definition of zero knowledge because the 
simulator can execute the code of the cheating verifier and observe the oracle 
queries made during the executions. This is a crucial advantage over mere black- 
box simulation of the cheating verifier in the standard model. On the other hand, 
we do not allow the simulator to see the oracle queries made by Z. Consider a 
typical application, namely that of sequential repetitions of the protocol. Here, the 
auxiliary input is a transcript from previous executions of the protocol and may 
therefore depend on the oracle RO. The cheating verifier receives the transcript, 
but does not gain access to the private coin tosses used to generate the transcript. 
The distinction arises from the fact that we allow the simulator to depend on the 
cheating verifier but not on Z. 

4 Zero-Knowledge Protocols and Separations 

Several constructions of zero-knowledge protocols for NP in the RO model were 
given in 11312211 II . It is straight-forward to verify that the zero-knowledge protocol in 
|H is also auxiliary-input epro zero-knowledge. In an unpublished work E3, Pass 
determined the round-complexity of auxiliary-input NPRO zero-knowledge protocols 
for NP. We summarize these results below: 

Theorem 1 (protocols 11312212311 1. Assuming the existence of one-way functions, there 
exist: 


- a one-round proof of knowledge protocol for NP that is auxiliary-input EPRO 
zero-knowledge (moreover, we may assume that the knowledge extractor is straight-line 
and runs in strict polynomial time); 

- a two-round protocol for NP that is NPRO zero-knowledge; and 

- a 3-round protocol for NP that is auxiliary-input NPRO zero-knowledge. 

Furthermore, each of these protocols has perfect completeness, negligible soundness, 
and an efficient prover. 

Theorem 2 (triviality [22,23]). Only languages in BPP have a one-round NPRO 
zero-knowledge protocol or a 2-round auxiliary-input NPRO zero-knowledge protocol. 

We outline the proofs in E3. The 3-round auxiliary-input NPRO zero-knowledge 
protocol for NP is based on the 2-round NPRO zero-knowledge protocol in E3, except 
we have the prover pick a random prefix a in the first round, and prepend a to all 
prover’s and verifier’s queries to the random oracle. The proof of Theorem 2 follows 
essentially from the fact that the proofs of the analogous statements in the standard model 
lfT?l relativize. 

Next, we state our first result, separating fpro and epro zero-knowledge. 

Theorem 3. Assuming the existence of one-way permutations, there exists a protocol 
that is auxiliary-input FPRO zero-knowledge but not EPRO zero-knowledge. 
Fig. 1. Relations between different variants of zero knowledge in the RO model, assuming the 
existence of one-way permutations. An arrow is an implication, and a crossed arrow indicates a 
separation. We stress that the relations refer to protocols satisfying the respective notion of 
zero-knowledge. 


Proof. Let $\pi$ be a one-way permutation, and consider the following protocol for the 
relation $R = \{(x, w) \mid x = \pi(w)\}$, where $L_R = \{0,1\}^*$ (note that soundness holds 
vacuously): 


Common input: An instance $x \in \{0,1\}^n$. 

Prover’s private input: A witness $w \in \{0,1\}^n$. 

P → V: Sends $\alpha \leftarrow \{0,1\}^n$. 

V → P: Sends $r \leftarrow \{0,1\}^n$. 

P → V: If $r = RO(\alpha \circ w)$, send $w$; else, send $RO(\alpha \circ w)$. 
Verification: V always accepts. 

To see that this protocol is auxiliary-input FPRO zero-knowledge,⁹ fix a cheating 
verifier $V^*$ (along with its random tape and an auxiliary input $z$ from $Z$), pick a random 
$\alpha$, and simulate the execution of $V^*$, forwarding the oracle queries made by $V^*$ to 
RO, until we obtain its first message $r$. During the simulation, we also check if any 
of $V^*$’s queries matches $\alpha \circ w$ (which we can check efficiently given $x$). If so, we 
would have recovered $w$, and may successfully compute the output of $V^*$. If we do 
not manage to recover $w$, we simulate the prover’s response with a random string 
$r' \in \{0,1\}^n$ and continue to simulate the execution of $V^*$, forwarding all oracle queries 
to RO, unless the query matches $\alpha \circ w$, in which case we respond with $r'$. This is ok 
because with probability $1 - \mathrm{neg}(n)$ over $\alpha$, none of the queries made by $Z$ has prefix 
$\alpha$. This completes the description of the zero-knowledge simulator. 

Suppose on the contrary that the protocol is EPRO zero-knowledge, and consider the 
simulator S that outputs the view of the honest verifier. Fix $x \in L$, and consider a 
distinguisher with $w = \pi^{-1}(x)$ hardwired into it. Then, S must output a transcript that 
contains $RO[\ell](\alpha \circ w)$ with probability $1 - \mathrm{neg}(n)$. For the latter, S must, with high 
probability, either query RO at $\alpha \circ w$ or output a list $\ell$ that contains the string $\alpha \circ w$. In 
both cases, we may derive a PPT that on input $x$, outputs $\pi^{-1}(x)$ with high probability, 
which contradicts $\pi$ being one-way. □ 
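
The protocol and the FPRO simulator from the proof can be sketched as follows. This is a toy illustration under loud assumptions: `pi` is an affine map that is a permutation but certainly not one-way, all strings are 16 bytes, and the names `RandomOracle`, `prover_reply`, `real_run`, `simulate`, `KnowsWitness` and `ChecksConsistency` are hypothetical. The cheating verifier is modelled as an object with a `challenge` and an `output` method, both of which receive the oracle as a callable.

```python
import os

N = 16  # byte length of x, w, alpha and r (assumption for this sketch)

class RandomOracle:
    def __init__(self):
        self.table = {}
    def query(self, q: bytes) -> bytes:
        if q not in self.table:
            self.table[q] = os.urandom(N)
        return self.table[q]

def pi(w: bytes) -> bytes:
    """Toy permutation on N-byte strings; a placeholder, NOT one-way."""
    v = (5 * int.from_bytes(w, "big") + 1) % (1 << (8 * N))
    return v.to_bytes(N, "big")

def prover_reply(ro, alpha, w, r):
    h = ro.query(alpha + w)
    return w if r == h else h

def real_run(x, w, verifier, ro):
    alpha = os.urandom(N)
    r = verifier.challenge(alpha, ro.query)
    return verifier.output(prover_reply(ro, alpha, w, r), ro.query)

def simulate(x, verifier, ro):
    """FPRO simulator: knows x but not w, and watches the verifier's queries."""
    alpha, fake, found = os.urandom(N), os.urandom(N), []

    def hits(q):  # does q equal alpha || w for the witness w of x?
        return len(q) == 2 * N and q[:N] == alpha and pi(q[N:]) == x

    def watching(q):
        if hits(q):
            found.append(q[N:])      # the verifier already "knows" w
        return ro.query(q)

    r = verifier.challenge(alpha, watching)
    if found:                        # w recovered: behave like the honest prover
        return verifier.output(prover_reply(ro, alpha, found[0], r), watching)

    def patched(q):                  # answer alpha || w with the fake response
        return fake if hits(q) else ro.query(q)
    return verifier.output(fake, patched)

class KnowsWitness:
    """Verifier with w hardwired that sends r = RO(alpha || w)."""
    def __init__(self, w): self.w = w
    def challenge(self, alpha, oracle): return oracle(alpha + self.w)
    def output(self, msg, oracle): return msg

class ChecksConsistency:
    """Verifier with w hardwired that checks the reply against the oracle."""
    def __init__(self, w): self.w = w
    def challenge(self, alpha, oracle):
        self.alpha = alpha
        return os.urandom(N)
    def output(self, msg, oracle): return oracle(self.alpha + self.w) == msg

ro = RandomOracle()
w = os.urandom(N)
x = pi(w)
```

In both test verifiers the simulated output agrees with the real one. The second verifier only accepts because the simulator is free to answer the query at alpha || w itself, which is exactly the freedom that the EPRO and NPRO models take away.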


9 Informally, the prover uses $(\alpha, RO(\alpha \circ w))$ to check whether the verifier already “knows” $w$. 
5 Sequential Composition Fails without Dependent Auxiliary Input 

In this section, we present zero-knowledge protocols which are no longer zero- 
knowledge when executed twice sequentially. The protocols are similar in spirit to 
that in a, a zero-knowledge protocol in the standard model that is no longer zero- 
knowledge when executed twice in parallel. The protocols exploit zero-knowledge 
proofs of knowledge (which may be realized non-interactively in the random oracle 
model), using these proofs as the auxiliary input which a cheating verifier could use 
to “gain knowledge”. Specifically, the prover will send the verifier a zero-knowledge 
proof of knowledge of the witness, and the cheating verifier will copy this proof to 
“claim” knowledge of the witness. The apparent contradiction arises from a problem 
in the definition of proofs of knowledge in the random oracle model, an issue we will 
address in Section 0 

Theorem 4. Assuming the existence of one-way functions, fpro zero knowledge, 
epro zero knowledge, and npro zero knowledge are not closed under sequential 
composition. 

Proof (sketch). We begin by constructing an EPRO zero-knowledge protocol that is 
no longer zero-knowledge when composed twice. The protocol is for the language 
L corresponding to the relation $R = \{(x, w) \mid x = f(w)\}$, where $f$ is a one-way 
function, and we use as an underlying protocol a one-round EPRO zero-knowledge proof 
of knowledge protocol (from Theorem 1). 


Common input: An instance $x \in \{0,1\}^n$. 

Prover’s private input: A witness $w \in \{0,1\}^n$. 

V → P: Send a random string $r$. 

P → V: If $r$ is an EPRO zero-knowledge proof of knowledge that $x \in L$, 
send $w$, else, send an EPRO zero-knowledge proof of knowledge 
that $x \in L$. 

Verification: V accepts if it receives either $w$ such that $f(w) = x$ or an 
accepting proof of knowledge that $x \in L$. 


To prove EPRO zero-knowledge, the simulator runs the cheating verifier to obtain the 
first message $r$. If $r$ is an accepting proof of knowledge for $x \in L$, the simulator runs 
the knowledge extractor to obtain a valid witness $w$. Otherwise, the simulator runs the 
zero-knowledge simulator for the underlying zero-knowledge protocol to generate the 
second-round message. We would actually require that the underlying zero-knowledge 
protocol be auxiliary-input epro zero-knowledge, which is ok. 

To see that this protocol is not zero-knowledge when composed twice, consider the 
cheating verifier V* that sends a random string in the first execution, and sends the 
prover’s response as its first message in the second execution. For all $x \in L$, the 
transcript between the honest prover and $V^*$ (for two sequential repetitions) will contain 
$f^{-1}(x)$ with probability 1. That $f$ is one-way implies that there is no PPT simulator for 
two sequential repetitions of this protocol. 

A similar modification to Pass’s 2-round NPRO zero-knowledge protocol for NP 
yields a 2-round npro zero-knowledge protocol that is no longer NPRO zero-knowledge 
when composed twice. □ 
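
The copying attack on two sequential executions can be sketched in Python. The sketch swaps in a Fiat-Shamir-transformed Schnorr proof as a stand-in for the one-round proof of knowledge of Theorem 1, with $f(w) = g^w \bmod p$ playing the role of the one-way function; this is an assumption made for illustration and not the construction the theorem relies on. The parameters are toy-sized and insecure, SHA-256 stands in for the random oracle, and all function names are hypothetical.

```python
import hashlib
import secrets

P = 2**127 - 1   # a Mersenne prime; toy-sized, for illustration only
G = 3

def f(w: int) -> int:
    """Stand-in for the one-way function: modular exponentiation."""
    return pow(G, w, P)

def H(*parts) -> int:
    """Hash standing in for the random oracle, mapped to an exponent."""
    data = b"|".join(str(v).encode() for v in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1)

def prove(x: int, w: int):
    """Non-interactive Schnorr proof of knowledge of w with f(w) = x."""
    k = secrets.randbelow(P - 1)
    t = pow(G, k, P)
    s = (k + H(x, t) * w) % (P - 1)
    return (t, s)

def verify(x: int, proof) -> bool:
    if not (isinstance(proof, tuple) and len(proof) == 2):
        return False
    t, s = proof
    return pow(G, s, P) == (t * pow(x, H(x, t), P)) % P

def prover(x, w, first_msg):
    """Honest prover of the protocol: reveal w only to someone who 'knows' w."""
    return w if verify(x, first_msg) else prove(x, w)

def copy_attack(x, run_prover):
    """Cheating verifier for two sequential executions."""
    junk = (secrets.randbelow(P - 1) + 1, secrets.randbelow(P - 1))
    proof = run_prover(junk)   # execution 1: receive the prover's proof
    return run_prover(proof)   # execution 2: replay it as the first message

w = secrets.randbelow(P - 1)
x = f(w)
leaked = copy_attack(x, lambda m: prover(x, w, m))
```

The verifier never learns how to produce a proof; it only replays one. Because the proof carries no identifier tying it to a particular execution, the second run of the honest prover accepts it and releases the witness.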

Remark 1. Our counter-examples are efficient-prover protocols (looking ahead, our 
sequential composition theorem only holds for efficient-prover protocols). This means 
that a cheating verifier (which is allowed to be nonuniform) can in fact simulate an 
interaction between the honest prover and the honest verifier. This is different from 
the counter-example in m wherein the cheating verifier cannot simulate such an 
interaction. There, the honest prover is allowed nonuniformity, whereas the cheating 
verifier is not, and the counter-example exploits the fact that the honest prover is “more 
powerful” than the class of cheating verifiers in an essential manner. This distinction 
was previously made in 0. 

6 Sequential Composition with Dependent Auxiliary Input 

Next, we prove a sequential composition lemma for auxiliary-input epro zero- 
knowledge and auxiliary-input npro zero-knowledge, which confirms that these are 
in some sense indeed the “right” definitions. 

Theorem 5 (sequential composition). Let (P, V) be an efficient-prover protocol in the 
RO model. Let $Q(\cdot)$ be a polynomial, and let $(P_Q, V_Q)$ be an interactive protocol that 
on common input $x \in \{0,1\}^n$, proceeds in $Q(n)$ phases, each of them consisting of 
executing the interactive protocol (P, V) on common input $x$ (with independent coin 
tosses for P). If (P, V) is auxiliary-input EPRO (resp. NPRO) zero-knowledge, then 
$(P_Q, V_Q)$ is also auxiliary-input EPRO (resp. NPRO) zero-knowledge. 

Proof (sketch). We begin with the proof for EPRO zero-knowledge. The high-level 
structure is similar to that in [14] for establishing a sequential composition lemma for 
zero-knowledge proofs in the standard model. We start by partitioning the cheating 
verifier $V_Q^*$ for $(P_Q, V_Q)$ into $Q(n)$ phases, each of which is the execution of a verifier 
$V^*$ for a stand-alone protocol (P, V). $V^*$ takes as input the common input $x$ and an 
auxiliary string encoding the state¹⁰ for $V_Q^*$ at the end of some phase $i$ (the string also 
encodes $i$) of an interaction with $P_Q$, and upon interacting with P produces as output 
another string encoding the state for $V^*$ at the end of phase $i + 1$. The zero-knowledge 
property of (P, V) then guarantees a simulator S for $V^*$. 

We generalize the earlier notation for programming a function by recursively 
defining $RO[\ell_1, \ldots, \ell_{i+1}]$ as $(RO[\ell_1, \ldots, \ell_i])[\ell_{i+1}]$. We may now specify the simulator 
for $V_Q^*$ as follows: on input $(x, z)$, 


10 For simplicity, we may think of the string as encoding the transcript for the first $i$ phases of 
interaction with $P_Q$ along with the random tape and auxiliary input for $V_Q^*$. 
- Set $\tau_0 = z$. 

- Run the simulator $S$ for $Q$ phases using the simulated oracle generated in 
the previous phase; that is, for $i = 1, 2, \ldots, Q$, 

- Output $(\ell_1 \cup \cdots \cup \ell_Q, \tau_Q)$.$^{11}$ 
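The recursive programming notation, together with the merged table that footnote 11 refers to, can be illustrated with a small sketch. It is only an illustration under stated assumptions: `base_oracle` uses SHA-256 as a stand-in for RO, and the names `program`, `program_all` and `merge` are hypothetical helpers, not notation from the paper.

```python
import hashlib

def base_oracle(query: bytes) -> bytes:
    # Assumption: SHA-256 stands in for RO; a real random oracle is not a fixed hash.
    return hashlib.sha256(query).digest()

def program(oracle, table):
    """Return oracle[table]: answer from `table` where defined, else from `oracle`."""
    def programmed(query: bytes) -> bytes:
        return table[query] if query in table else oracle(query)
    return programmed

def program_all(oracle, tables):
    """RO[l_1, ..., l_k], built recursively as (RO[l_1, ..., l_{k-1}])[l_k]."""
    for table in tables:
        oracle = program(oracle, table)
    return oracle

def merge(tables):
    """A single table l with RO[l] = RO[l_1, ..., l_k]; later tables take priority."""
    merged = {}
    for table in tables:
        merged.update(table)
    return merged
```

Programming with the merged table and programming table by table give the same oracle, because the outermost (latest) table is always consulted first.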

We define $Q(n) + 1$ hybrids, $H_0, \ldots, H_Q$. The $j$'th hybrid is defined as the output of 
the following experiment: 

- Run $Z^{RO}(1^n) \to z$, $\tau_0 = z$. 

- Let the honest prover interact with the cheating verifier for $j$ phases; that 
is, for $i = 1, 2, \ldots, j$, 


- For the remaining $Q - j$ phases, run the simulator $S$ using the simulated 
oracle generated in the previous phase; that is, for $i = j + 1, \ldots, Q$, 

- $H_j$ is $(RO[\ell_{j+1}, \ldots, \ell_Q], \tau_Q, z)$. 

Note that $H_0$ and $H_Q$ correspond to the simulated transcript and the actual transcript, 
respectively. We need to show that $H_0$ and $H_Q$ are computationally indistinguishable. 
Suppose on the contrary that this is not the case. Then, by a standard hybrid argument, 
there exist an index $j$ and a nonuniform oracle PPT $D$ that distinguishes the two 
consecutive hybrid distributions $H_j$ and $H_{j+1}$. We define an auxiliary input machine 
$Z_j$ that computes the interaction between $P$ and $V^*$ for the first $j$ phases: 


- Run $Z^{RO}(1^n) \to z$, $\tau_0 = z$. 

- For $i = 1, 2, \ldots, j$: $(P^{RO}, V^{*RO}(\tau_{i-1}))(x) \to \tau_i$. 

- Output $(z, \tau_j)$. 


This allows us to rewrite $H_j$ and $H_{j+1}$ as follows. For $H_j$: 


- Run $Z_j^{RO}(1^n) \to (z, \tau_j)$. 

- $S^{RO}(x, \tau_j) \to (\tau_{j+1}, \ell_{j+1})$. 

- for $i = j + 2, \ldots, Q$, 

- Output $(RO[\ell_{j+1}, \ldots, \ell_Q], \tau_Q, z)$. 

For $H_{j+1}$: 

- Run $Z_{j+1}^{RO}(1^n) \to (z, \tau_{j+1})$. 

- for $i = j + 2, \ldots, Q$, 

- Output $(RO[\ell_{j+2}, \ldots, \ell_Q], \tau_Q, z)$. 


11 We abuse $\cup$ slightly here; we want $\ell_1 \cup \cdots \cup \ell_Q$ to denote the set satisfying
$RO[\ell_1 \cup \cdots \cup \ell_Q] = RO[\ell_1, \ldots, \ell_Q]$. 
This is sufficient to contradict zero-knowledge of $(P, V)$ for the $(j+1)$'th phase.$^{12}$ The 
result for NPRO zero-knowledge follows as a special case corresponding to
$\ell_1 = \cdots = \ell_Q = \emptyset$. □ 
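The passage from "the two extreme hybrids are distinguishable" to "two consecutive hybrids are distinguishable" is the usual telescoping argument. As a sketch, write $p_j$ for the probability that the distinguisher $D$ outputs 1 on a sample from the $j$'th hybrid (the symbol $p_j$ is introduced here only for illustration):

```latex
\[
\bigl| p_Q - p_0 \bigr|
  = \Bigl| \sum_{j=0}^{Q-1} \bigl( p_{j+1} - p_j \bigr) \Bigr|
  \le \sum_{j=0}^{Q-1} \bigl| p_{j+1} - p_j \bigr|
  \le Q \cdot \max_{0 \le j < Q} \bigl| p_{j+1} - p_j \bigr| .
\]
```

Hence if $|p_Q - p_0| \ge \epsilon(n)$ for a non-negligible $\epsilon$, some index $j$ satisfies $|p_{j+1} - p_j| \ge \epsilon(n)/Q(n)$, which is still non-negligible because $Q$ is a polynomial.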

Remark 2. One might expect a priori that zero-knowledge is preserved under a 
polynomial number of repetitions once we take into account oracle-dependent auxiliary 
information. However, we are only able to establish such a statement in the EPRO and 
NPRO models. Technically, the proof breaks down for FPRO zero-knowledge because 
the simulator is not required to specify the simulated random oracle. In particular, this 
shows that sequential composition is more subtle than merely accounting for auxiliary 
input. A natural question that arises is whether prepending a random prefix to all oracle 
queries allows us to transform any protocol that is FPRO zero-knowledge into one 
that remains zero-knowledge under a polynomial number of sequential compositions. 
We note that using a random prefix only guarantees “independence” of the prover’s 
messages across different iterations; a cheating verifier is not limited to queries with the 
given prefix.$^{13}$ 

7 Proofs of Knowledge with Dependent Auxiliary Input 

Several constructions of zero-knowledge protocols begin with the verifier sending a 
proof of knowledge, for instance, that used in our counter-example, and the NPRO zero- 
knowledge protocol in 112 21 . If we allow the cheating verifier to receive an auxiliary 
input that depends on the random oracle, we would need to also extend the definition of 
proof of knowledge to incorporate auxiliary inputs that depend on the random oracle. 

Definition 2. Let $(P, V)$ be an interactive protocol for a language $L = L_R$. We say 
that $(P, V)$ is a proof of knowledge w.r.t. dependent auxiliary input in the RO model 
(or auxiliary-input proof of knowledge) if for every oracle PPT $P^*$, there exists an 
oracle PPT $E$ such that for all nonuniform oracle PPT $Z$ and for all $x$: 

$$\Pr[Z^{RO}(1^{|x|}) \to z;\ E^{RO}(x, z) \to w;\ (x, w) \in R] \ \ge\ \Pr[Z^{RO}(1^{|x|}) \to z;\ (P^{*RO}(z), V^{RO})(x) \text{ accepts}] - \mathrm{neg}(|x|)$$ 

12 Unlike in the standard model, we cannot use an averaging argument to fix the output $(z, \tau_j)$ 
from $Z_j$. This is because the output depends on RO. We may eliminate the efficient-prover 
constraint in the lemma by allowing the auxiliary input machine $Z$ in the definition of 
zero-knowledge to be unbounded, but we do not know how to achieve zero-knowledge without a 
bound on the query complexity of $Z$. 

13 Specifically, consider the trivial protocol for the language {0, 1}* wherein the prover sends 
nothing and the (honest) verifier always accepts. Note that using a random prefix does not 
affect this protocol in any way. Now, consider a cheating verifier that after each iteration 
outputs $RO(0^n)$. The zero-knowledge simulator for a single iteration (without dependent 
auxiliary input) may simply output a random string, but simply concatenating the output of 
this simulator for a polynomial number of times does not yield a correct simulation of the view 
of the cheating verifier for a polynomial number of iterations. This highlights the difference 
between simulating the transcript vs the output of the verifier, and the difficulty in ensuring 
“independence” of the random oracles amongst different iterations. 
The next result implies a separation between zero-knowledge proofs of knowledge and 
zero-knowledge auxiliary-input proofs of knowledge. Specifically, we rule out non- 
interactive protocols that are simultaneously zero-knowledge and a proof of knowledge 
(in the above sense); otherwise, we can simply apply the knowledge extractor to the 
simulated proof to obtain a candidate witness. Some care is needed in arguing that the 
knowledge extractor works on the simulated proof, which uses a simulated random 
oracle and not the actual one. The reason why this approach only works for the new 
definition of proofs of knowledge is that we allow a cheating prover to receive oracle- 
dependent auxiliary input. In particular, the cheating prover may receive a convincing 
proof as auxiliary input, and the knowledge extractor can neither rewind the auxiliary 
input machine nor observe the oracle queries it makes. The proof is deferred to the full 
version. 

Theorem 6. Assuming the existence of one-way functions, there is a 2-round public-coin 
argument system for NP that is auxiliary-input EPRO zero-knowledge, and also an 
auxiliary-input proof of knowledge. On the other hand, only languages in BPP have a 
non-interactive argument system that is EPRO zero-knowledge and an auxiliary-input 
proof of knowledge. 

8 Circuit Obfuscation in the Random Oracle Model 

Let O be a probabilistic polynomial- time algorithm and let C be a family of circuits. Let 
A, S , Z, D be oracle PPTs. Given C £ C, we define diff°^°|’° 3 (C) to be the quantity 

Pr [z «- Z 0l (l |C| ); r «- A RO (0 RO (C)); D°%t,z)) = l] 
-Pr[z^Z 0l (ll c l); (t,£) <- S RO ’ c (z); D° 3 (r,z) = l] 

Definition 3 (circuit obfuscation 111*11111611 ). A probabilistic polynomial-time algo- 
rithm O is an obfuscator for the family of circuits C = U n C n in the fpro model 
(where C n is the subset of circuits in C that take inputs of length n) if the following three 
conditions hold: 

- (approximate functionality) There exists a negligible function $\alpha$ such that for all 
$n$, for all $C \in \mathcal{C}_n$, with probability $1 - \alpha(n)$ over the internal coin tosses of the 
obfuscator and over RO, $O^{RO}(C)$ describes a circuit with RO-gates that computes 
the same function as $C$. 

- (polynomial slowdown) There is a polynomial $p$ such that for every circuit $C \in \mathcal{C}$, 
$|O(C)| \le p(|C|)$. 

- (virtual black-box property) For every oracle PPT $A$, there exists an oracle PPT 
$S$ such that for all $C \in \mathcal{C}$ and for all nonuniform oracle PPTs $Z$ and $D$, 
$\mathrm{diff}^{\varepsilon,\varepsilon,\varepsilon}_{A,S,Z,D}(C)$ is negligible (as a function of $|C|$). 




H. Wee 


[Figure 2: diagram with nodes "obf in FPRO", "obf in EPRO", "obf in NPRO", "exact learnable", "aux-input obf in FPRO", "aux-input obf in EPRO", "aux-input obf in NPRO".] 

Fig. 2. Relations between different variants of obfuscation in the RO model. An arrow is an 
implication, a double-tailed arrow is an equivalence, and a crossed arrow indicates a separation. 
We stress that the relations refer to families of circuits that are obfuscatable according to the 
respective notions. 


We say that $\mathcal{C}$ is FPRO obfuscatable if there exists an obfuscator for $\mathcal{C}$. In addition, we 
obtain: 

EPRO obfuscatable if $O_1 = \varepsilon$, $O_2 = RO$, $O_3 = RO[\ell]$ 

NPRO obfuscatable if $O_1 = \varepsilon$, $O_2 = RO$, $O_3 = RO$ 

auxiliary-input FPRO obfuscatable if $O_1 = RO$, $O_2 = \varepsilon$, $O_3 = \varepsilon$ 
auxiliary-input EPRO obfuscatable if $O_1 = RO$, $O_2 = RO$, $O_3 = RO[\ell]$ 
auxiliary-input NPRO obfuscatable if $O_1 = RO$, $O_2 = RO$, $O_3 = RO$ 

A point function $I_w$ is a boolean function that evaluates to 1 on input $w$ and 0 
everywhere else. As observed in ED, to obfuscate $I_w$ in the RO model, we may simply 
pick a random $a \in \{0,1\}^{|w|}$ and store $a$, $RO(a \circ w)$ in the obfuscated circuit, which on 
input $x$, outputs 1 iff $RO(a \circ x) = RO(a \circ w)$. 
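The point-function construction just described can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 stands in for the random oracle, the salt `a` is taken to have the same byte length as `w`, and the function names are hypothetical.

```python
import hashlib
import secrets

def ro(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the random oracle RO.
    return hashlib.sha256(data).digest()

def obfuscate_point(w: bytes):
    """Return the pair (a, RO(a || w)) for a fresh random salt a with |a| = |w|."""
    a = secrets.token_bytes(len(w))
    return a, ro(a + w)

def evaluate(obfuscated, x: bytes) -> bool:
    """Output True (i.e. 1) iff RO(a || x) equals the stored value RO(a || w)."""
    a, tag = obfuscated
    return ro(a + x) == tag
```

The obfuscated circuit stores only the salt and one oracle value, so evaluating it costs a single oracle query per input.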

Theorem 7 (obfuscating point functions l(T9l ). There exists an auxiliary-input FPRO 
obfuscator for the class of point functions. 

One may expect some modification of the previous construction to yield an epro 
obfuscator for point functions, but this turns out to be impossible. The next result 
follows from a similar characterization in EJ for npro obfuscation: 

Theorem 8 (triviality). A family of circuits $\mathcal{C} = \bigcup_n \mathcal{C}_n$ is EPRO obfuscatable iff $\mathcal{C}$ is 
efficiently and exactly learnable using membership queries. 

Proof (sketch). Suppose $\mathcal{C}$ is efficiently and exactly learnable using membership 
queries. Consider an obfuscator that simply takes the input circuit $C$ and outputs the 
circuit produced by the learning algorithm given oracle access to $C$; the simulator does 
essentially the same thing. 

The learning algorithm for an EPRO obfuscatable family of circuits $\mathcal{C}$ is very simple. 
To evaluate $C \in \mathcal{C}$ on input $x$ (given oracle access to $C$ and input $x$), run the simulator 
$S$ for the trivial adversary $A$ that merely outputs the obfuscated circuit to obtain $(\tau, \ell)$, 
answering queries to RO with random coin tosses, and then evaluate $\tau$ on the input 
$x$ using the simulated oracle $RO[\ell]$. This may be modified via standard techniques 
(specifically, we will need to amplify the soundness error via repetition and then take 
a union bound over all inputs) to yield a learning algorithm that on oracle access to $C$ 
outputs w.h.p. a (standard) circuit that agrees with $C$ on all inputs. □ 
Abstract. A universally composable (UC) blind signature functionality 
requires users to commit to the message to be blindly signed. It 
is thereby impossible to realize in the plain model. We show that even 
non-committing variants of UC blind signature functionality remain 
unrealizable in the plain model. We then characterize adaptively secure UC 
non-committing blind signatures in the common reference string model 
by presenting equivalent stand-alone security notions. We also present a 
generic construction based on Fischlin’s conceptually simple blind 
signature scheme. 


1 Introduction 

Background. Since the introduction of blind signatures 0 vast number of 
papers are devoted to efficient constructions, security analysis, and extensions. 
Major applications include untraceable payment systems 0 and anonymous vot- 
ing 1 1 1 11 .'1| . The standard notions of security for blind signature schemes in the 
stand-alone setting are blindness and unforgeability [912211 8j . Universal compos- 
ability (UC) framework 0 offers security in more general setting where other 
arbitrary protocols are running concurrently. It asserts that the properties pro- 
vided by an idealized functionality retain even under general composition. A 
blind signature functionality is first suggested by Canetti in 0 and formally 
defined by Fischlin in EH with a round-optimal realization in the common ref- 
erence string (CRS) model. Kiayias and Zhou study adaptive security in . 

In known blind signature functionalities, e.g., jllll9j . a user commits to a 
message to request a signature. Then a signature is issued by the functionality 
remotely from the view of the signer. In EH , Fischlin pointed out that a UC blind 
signature protocol that realizes such a functionality implies a UC commitment 
protocol in the static corruption model and thus impossible to realize in the plain 
model 0 . A more formal argument is given by Lindell in [20121 j . A common idea 
for these arguments is that the existence of a simulator implies extraction of the 
input message and hence contradicts to the blindness. 

Is there hope of circumventing the above impossibility if the functionality is 
relaxed by giving up the commitment property? In some applications such as a 
simple e-cash or coupon system, every message can be a random string that 
the users do not need to know or fix in advance. Such applications are only concerned 
with blindness and unforgeability. In [2], Buan, Gjøsteen, and Krakmo presented a 
non-committing blind signature functionality where corrupt users no longer 
deposit messages. Thus there is no need to extract the messages for simulation. It 
was shown that such a non-committing blind signature functionality is realizable 
in the plain model and the presented security is equivalent to the unforgeability 
and weak blindness defined by Juels, Luby and Ostrovsky in |T%j . 

Our contribution. Somewhat contrary to this positive result, we show that universally 
composable non-committing blind signatures are still impossible in the plain model. 
Our proof shows that if the functionality provides blindness, the presence of a 
simulator contradicts unforgeability in the plain model. Importantly, the 
positive result in [2] holds only in a restricted corruption model where the signer 
can be corrupted only after the key generation process. As stated in that paper, 
such a restriction is so strong that it is equivalent to incorporating a trusted 
party in the protocol. Our result holds for the most general corruption model. 
It is also quite robust in the sense that it applies to a wide variety of blind 
signature functionalities that formulate blindness in a reasonable way, as all existing 
functionalities do. 

Despite the negative result, non-committing blind signatures remain an 
interesting cryptographic object to study. The less demanding functionality would 
allow simple protocol designs in advanced models. This paper presents a 
thorough characterization of a non-committing blind signature functionality that is 
secure against adaptive adversaries without secure erasures. We prove that the 
properties captured by the functionality are equivalent to a pair of stand-alone 
security notions in the common reference string model, which are the standard 
unforgeability and a new strong notion of blindness which we call equivocal 
simulation blindness. We then decompose equivocal simulation blindness into more 
handy notions called session equivocality and signature equivocality in a specific 
setting. We also show a generic construction. The simplicity of our framework 
can be highlighted when compared to the result on the adaptive security for the 
committing blind signatures El- 

Due to lack of space, most proofs are moved to the full version P , which also 
includes results in the static corruption model. 

2 Notations 

All algorithms in this paper run in polynomial time in the security parameter $\lambda$. 
By $y \leftarrow A(x; r)$ we mean that algorithm $A$ is invoked with input $x$ and uniformly 
chosen randomness $r$, and outputs something labeled as $y$. Randomness $r$ may be 
omitted. By $(a, b) \leftarrow \langle A(x), B(y) \rangle$ we denote an execution of interactive Turing 
machines $A$ and $B$ on input $x$ and $y$ and with output $a$ and $b$, respectively. When 
only one side of the output is of concern, we write $a \leftarrow \langle A(x), B(y) \rangle_L$ for the 
left side and $b \leftarrow \langle A(x), B(y) \rangle_R$ for the right side. We write $a[\omega] \leftarrow A$ when $A$ 
has some extra output $\omega$. The meaning of $\omega$ depends on the context and will be 
noted whenever this notation is used. For notations and notions related to the 
UC framework we refer to 0. 
3 Blind Signature Schemes 

3.1 Syntax and Standard Security Notions 

A blind signature scheme BS in the common reference string model consists of 
five algorithms BS = BS.{Crs, Key, User, Signer, Vrf}. BS.Crs is a common 
reference string generator. BS.Key is a key generator. BS.User is an interactive 
signature request algorithm and BS.Signer is a signing algorithm. Interaction between 
BS.User and BS.Signer forms a signature generation protocol. BS.Vrf is a 
signature verification algorithm. A blind signature scheme must provide completeness 
and consistency. Roughly, completeness is that for any $(m, \sigma)$ made faithfully 
through BS.Crs, BS.Key, BS.User, and BS.Signer, verification algorithm BS.Vrf 
outputs 1. Consistency is that BS.Vrf outputs the same value for the same input 
(even for keys generated by an adversary). We refer to [Bj for details and dis- 
cussions on these properties. Two standard security notions are unforgeability 
and blindness as shown below. 

Definition 1 (Unforgeability: UF). A blind signature scheme BS is unforgeable 
if $\mathrm{Succ}_{F^*}(\lambda) = \Pr[\mathrm{Forge}^{BS}_{F^*}(\lambda) = 1]$ is negligible in $\lambda$ for any algorithm 
$F^*$, where $\mathrm{Forge}^{BS}_{F^*}$ is the experiment shown below. $F^*$ can access the oracle 
an arbitrary number of times concurrently. 

Experiment $\mathrm{Forge}^{BS}_{F^*}(\lambda)$: 

  $\Sigma \leftarrow$ BS.Crs$(1^\lambda)$ 

  $(vk, sk) \leftarrow$ BS.Key$(\Sigma)$ 

  $((m_1, \sigma_1), \ldots, (m_{k+1}, \sigma_{k+1})) \leftarrow F^{*\,\langle \cdot,\, \mathrm{BS.Signer}(\Sigma, sk) \rangle}(\Sigma, vk)$ 

  Return 1 iff 

    completed $\leftarrow \langle \cdot, \mathrm{BS.Signer}(\Sigma, sk) \rangle_R$ happens at most $k$ times, and 

    $m_i \ne m_j$ for all $1 \le i < j \le k + 1$, and 

    BS.Vrf$(\Sigma, vk, m_i, \sigma_i) = 1$ for all $1 \le i \le k + 1$. 

Strong unforgeability (sUF) is defined in the same way but requiring $(m_i, \sigma_i) \ne 
(m_j, \sigma_j)$ instead of $m_i \ne m_j$. This paper focuses on the above relatively weaker 
notion as it suffices for major applications. 
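The winning condition of the forgery experiment can be written as a small predicate. The sketch below is illustrative only: `verify` is a hypothetical callable standing in for BS.Vrf with the common reference string and verification key already fixed, and `completed_sessions` is assumed to be counted by the experiment on the signer's side.

```python
def forge_succeeds(pairs, completed_sessions, verify):
    """Check the three conditions under which the Forge experiment returns 1.

    pairs: list of (message, signature) output by the forger (k + 1 of them).
    completed_sessions: number of signing sessions the signer completed.
    verify: callable (message, signature) -> bool, a stand-in for BS.Vrf.
    """
    k = len(pairs) - 1
    # The signer must have completed at most k sessions.
    if k < 0 or completed_sessions > k:
        return False
    # All messages must be pairwise distinct (the weaker UF notion).
    messages = [m for m, _ in pairs]
    if len(set(messages)) != len(messages):
        return False
    # Every pair must pass verification.
    return all(verify(m, s) for m, s in pairs)
```

For the strong variant, the distinctness check would be applied to the pairs themselves rather than to the messages.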

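Definition 2 (Blindness: BL). A blind signature scheme BS is blind if 
$\mathrm{Adv}_{B^*}(\lambda) = |\Pr[\mathrm{Blind}^{BS}_{B^*}(\lambda, 0) = 1] - \Pr[\mathrm{Blind}^{BS}_{B^*}(\lambda, 1) = 1]|$ is negligible in $\lambda$ 
for any algorithm $B^*$, where $\mathrm{Blind}^{BS}_{B^*}$ is the experiment shown below. 

$\mathrm{Blind}^{BS}_{B^*}(\lambda, b)$: 

  $\Sigma \leftarrow$ BS.Crs$(1^\lambda)$ 

  $(vk, m_0, m_1) \leftarrow B^*(\Sigma)$ 

  $\sigma_b \leftarrow \langle \mathrm{BS.User}(\Sigma, vk, m_b), B^* \rangle_L$ 

  $\sigma_{1-b} \leftarrow \langle \mathrm{BS.User}(\Sigma, vk, m_{1-b}), B^* \rangle_L$ 

  If $\sigma_0 = \bot$ or $\sigma_1 = \bot$ then set $\sigma_0 = \sigma_1 = \bot$. 

  Return $\tilde{b} \leftarrow B^*(\sigma_1, \sigma_0)$ 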
For ease of notation, we represent algorithm $B^*$ as stateful so that it implicitly 
takes over its internal state from the previous invocation every time it is invoked 
by the experiment. Only new inputs are explicitly presented in the description. 
This convention is applied to all algorithms denoted with an asterisk (*) throughout 
the paper. 

As observed in HZ!, the above definition captures the case where the adversary 
attempts to get useful information by aborting the sessions. H2J extends the 
notion in such a way that, when adversary $B^*$ is given $(\bot, \bot)$ at the end, it is 
also given an extra piece of information that tells which session (the first or second 
or both) actually yielded $\bot$ on the user side. The results in this paper also hold 
with respect to this stronger notion of blindness. 


4 UC Non-committing Blind Signatures 

4.1 Functionality 

Figure 1 illustrates our non-committing blind signature functionality $\mathcal{F}_{ncb}$. In the 
figure, $v$ is a deterministic signature verification algorithm and $\Pi$ is a description of a 
stateless signing algorithm. See [5] for remarks on running arbitrary algorithms in 
a functionality. As with the ordinary signature functionality in [5], we formulate 
$\mathcal{F}_{ncb}$ so that it provides no security properties if an unregistered verification key 
is given as input to the signature generation and verification phases. See the 
discussion about the key management below. 

The idea of using counters to enforce unforgeability is the same as that in 
[2]. Because the counters are incremented at different times, our formulation 
works in the general communication model fully controlled 
by the adversary, while the one in [2] needs authenticated communication in its 
realization. Note that the bare signature functionality in [5] can be realized 
without an authenticated channel because there is no link between the public key and 
the identity of the signer, and it does not matter who issues a signature as long 
as the signature is valid. 

Non-committing Property. Observe that the input message $m$ from a corrupt 
user is neither sent anywhere nor stored in the functionality. Thus $S$ working on behalf 
of a corrupt user can complete the signature generation process whatever $m$ is. 
This formulation avoids the need to extract the message from 
corrupt users. 

Unforgeability. This property holds only while signer $P_s$ is honest. Counter 
$C_{cmpl}$ counts the number of completed signature generations on the signer’s side, 
while counter $C_{valid}$ counts the number of valid signatures on distinct messages 
received by honest users with legitimate verification. The verification process 
accepts signatures on new messages only if $C_{cmpl} > C_{valid}$. From the specification, 
it is clear that $C_{cmpl} \ge C_{valid}$ always holds as long as the signer is honest. Thus 
unforgeability is guaranteed in the absolute sense. To capture weak 
unforgeability, $C_{valid}$ is incremented only for unique messages in the signature generation 
Key Generation: Given (KeyGen, sid) from a party $P_s$, verify that sid = 
($P_s$, sid') for some sid'. If not, then ignore. Else, forward (KeyGen, sid) 
to simulator $S$. Then, on receiving (Generated, sid, $v$, $\Pi$) from $S$, send 
(Generated, sid, $v$) to $P_s$ and record ($P_s$, $v$, $\Pi$). Let $C_{cmpl} = C_{valid} = 0$, and 
$\Gamma$ be empty. This phase must be completed only once and before other phases. 

Signature Generation: On receiving (Request, sid, ssid, $v'$, $m$) for some $m$ from 
$P_u$, send (Request, sid, ssid, $v'$) to $S$ and do the following. 

  i. On receiving (Signed, sid, ssid) from $S$, forward it to $P_s$. Set 
     $C_{cmpl} \leftarrow C_{cmpl} + 1$. 

  ii. On receiving (Received, sid, ssid) from $S$, do as follows: 

     - If $P_u$ is honest and $v' = v$, then do as follows. If $(m, *, 1) \notin \Gamma$, set 
       $C_{valid} \leftarrow C_{valid} + 1$. Compute $\sigma \leftarrow \Pi(m)$ and record $(m, \sigma, 1)$ to $\Gamma$. 
       If $(m, \sigma, 0) \in \Gamma$, send an error message to signer $P_s$ and halt. Send 
       (Received, sid, ssid, $\sigma$) to $P_u$. 

     - Else if $P_u$ is corrupt or $v' \ne v$, ask $S$ and forward to $P_u$ whatever is received. 

Signature Verification: On receiving (Verify, sid, ssid, $v'$, $m$, $\sigma$) from some 
party $P_v$, set $\phi = v'(m, \sigma)$ and do as follows. 

  1. If $v' \ne v$, set $f = \phi$. 

  2. Else if $(m, \sigma, f') \in \Gamma$ for any $f'$, then set $f = f'$. 

  3. Else if $P_s$ is corrupt or $(m, *, 1) \in \Gamma$, then set $f = \phi$ and record $(m, \sigma, f)$ 
     to $\Gamma$. 

  4. Otherwise: 

     (a) If $C_{cmpl} > C_{valid}$, then set $f = \phi$ and $C_{valid} \leftarrow C_{valid} + f$. 

     (b) Otherwise, set $f = 0$. 

     Then record $(m, \sigma, f)$ to $\Gamma$. 

  Output (Verified, sid, ssid, $f$) to $P_v$. 

Player Corruption: On receiving corruption to $P_u$, send all inputs and 
outputs exchanged with $P_u$ to simulator $S$. Also send all randomness used in the 
evaluations of $\Pi$ with respect to $P_u$. 


Fig. 1. Non-committing blind signature functionality $\mathcal{F}_{ncb}$ 

process (see step (ii)). Strong unforgeability can be captured by removing 
conditions “if $(m, *, 1) \notin \Gamma$” and “or $(m, *, 1) \in \Gamma$” from the signature generation 
and verification phases respectively. 
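The interplay of the two counters can be made concrete with a small bookkeeping sketch. It covers only the case of the registered key and an honest user, omits session identifiers and the simulator, and stores records in a dictionary instead of a list of triples; the class and method names are hypothetical and not part of the functionality's specification.

```python
class NcbBookkeeping:
    """Counters and records of the functionality, registered key only (sketch)."""

    def __init__(self, verify, sign, signer_corrupt=False):
        self.verify = verify              # deterministic v(m, sigma) -> 0 or 1
        self.sign = sign                  # stateless Pi(m) -> sigma
        self.signer_corrupt = signer_corrupt
        self.c_cmpl = 0                   # sessions completed on the signer's side
        self.c_valid = 0                  # distinct messages with accepted signatures
        self.records = {}                 # (m, sigma) -> f

    def _has_valid(self, m):
        # True iff some record (m, *, 1) exists.
        return any(f == 1 and msg == m for (msg, _), f in self.records.items())

    def signed(self):
        # Step (i): the signer's side of a session completes.
        self.c_cmpl += 1

    def received(self, m):
        # Step (ii) for an honest user holding the registered key.
        if not self._has_valid(m):
            self.c_valid += 1
        sigma = self.sign(m)
        # Checked before recording, since the dictionary keeps one flag per pair.
        if self.records.get((m, sigma)) == 0:
            raise RuntimeError("signature was rejected earlier")
        self.records[(m, sigma)] = 1
        return sigma

    def verify_request(self, m, sigma):
        phi = self.verify(m, sigma)
        if (m, sigma) in self.records:                    # case 2
            return self.records[(m, sigma)]
        if self.signer_corrupt or self._has_valid(m):     # case 3
            f = phi
        elif self.c_cmpl > self.c_valid:                  # case 4(a)
            f = phi
            self.c_valid += f
        else:                                             # case 4(b)
            f = 0
        self.records[(m, sigma)] = f
        return f
```

In this sketch a well-formed signature on a fresh message is accepted only while a completed session is still unaccounted for, which is the sense in which the counters enforce unforgeability.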

Completeness and Consistency. If the signer and a user are not corrupted 
and the registered key is given as input to the signature generation phase, 
$(m, \sigma, 1)$ is recorded. The verification phase for such faithfully generated $(m, \sigma)$ 
and registered $v$ finds that record and always outputs $f = 1$. Thus 
completeness is captured. Consistency holds for free since algorithm $v$ is deterministic. 
Limiting $v$ to be deterministic loses generality but makes the exposition 
considerably simpler. For issues with respect to probabilistic verification algorithms 
see jhllil l 4j . 

Blindness. The important observations are: (1) $\Pi$ is fixed before any sub-session 
for signature generation starts, (2) $\Pi$ takes nothing but the message $m$ as input, 
and (3) message $m$ and $\Pi(m)$ are never sent to $S$ or $P_s$ during the signature 
generation phase. This formulation thereby assures that the remotely computed $\sigma$ is 
independent of the signature generation viewed by the signer. Such a mechanism, 
which we call remote signing, is suggested in and employed by all known blind 
signature functionalities. 

On Key Management. The “bare” signature functionality in [4ltij is 
formulated in such a way that it stores a single public key in every session and the 
security properties are guaranteed only for the registered public key. The 
functionality enjoys concise presentation and high modularity. We adopt this 
approach to define $\mathcal{F}_{ncb}$. Namely, if an unregistered $v'$ is given as input to the 
signature generation or verification phase, $\mathcal{F}_{ncb}$ behaves just as $S$ intends. So 
even though a user is honest, no security is guaranteed in such a case. (Recall 
that the environment can pass an arbitrary $v'$ to an honest user.) Accordingly, 
upper-level protocols that use $\mathcal{F}_{ncb}$ must be responsible for providing the registered $v$ 
to the honest users. 

An alternative formulation would be to let $\mathcal{F}_{ncb}$ explicitly reject an 
unregistered $v'$. It however results in incorporating a mechanism for distributing the 
correct public key within the blind signature protocol. For instance, the 
protocol realizing $\mathcal{F}_{ncb}$ may be constructed in the $\mathcal{F}_{ca}$-hybrid model, where $\mathcal{F}_{ca}$ is the 
certificate-authority functionality [5] that serves only the blind signature 
protocol. Though this kind of issue can be handled by the theorem of universal 
composition with joint state jHJ, we prefer $\mathcal{F}_{ncb}$ to be basic for the sake of higher 
modularity. 

In the literates, m implicitly follow the same approach as ours. They however 
define their functionality only for the case of receiving the registered public-key 
as input to the signature generation phase. It results in simpler presentation but 
eventually the details need to be provided with care. [OH shows more extended 
functionality such that it handles several public-keys under the same session-id 
and guarantees blindness for every set of signatures issued with the same public- 
key. This approach however suffers high complexity in its presentation. 

Variations. $\mathcal{F}_{ncb}$ in Fig. 1 notifies the environment only of the end of the signature 
generation process. It can be extended so that the environment can 
give the signer explicit approval or denial for starting the process by adding 
another round of interaction among $S$, $\mathcal{F}_{ncb}$, and $P_s$. It is also possible to let 
the environment know about the abnormal termination of the protocol in the 
same way. These modifications do not affect the results in this paper since 
they can be incorporated only by modifying the protocol wrapper in Section 5.2 
accordingly. 


4.2 Impossibility in the Plain Model 

This section shows that $\mathcal{F}_{ncb}$ cannot be realized without access to extra ideal 
functionalities or assuming some help from incorruptible parties. To make the 
statement meaningful, we consider non-trivial protocols where honest parties 
running the protocol with the right inputs terminate and output something with 
noticeable probability. 
Theorem 1. There exists no non-trivial protocol that securely realizes $\mathcal{F}_{ncb}$ in 
the plain model. 

Proof. We use $S$ to extract the remote signing function $\Pi$ and use it to break 
the unforgeability in the real protocol. Recall that a forgery could never happen 
in the ideal model. Thus $Z$ can distinguish the ideal process and a real protocol 
execution by observing a successful forgery. 

Suppose that there exists a non-trivial protocol $\pi$ that realizes $\mathcal{F}_{ncb}$ in the 
plain model. Recall that $\mathcal{F}_{ncb}$ is invoked when it receives (KeyGen, sid) from a 
signer. It then outputs (Generated, sid, $v$) to the signer. Protocol $\pi$ works in the 
same way since it realizes $\mathcal{F}_{ncb}$. Let $\pi_{kg}$ denote the part of $\pi$ that receives 
(KeyGen, sid) as input and outputs (Generated, sid, $v$). 

Consider a particular $A^*$ and $Z^*$ that behave in $\mathrm{EXEC}_{\pi, A^*, Z^*}$ as follows. $Z^*$ 
first asks $A^*$ to corrupt the signer. $Z^*$ then runs $\pi_{kg}$ with input (KeyGen, sid) 
and obtains (Generated, sid, $v$). (Here, without loss of generality, we assume 
that $\pi_{kg}$ can be run solely by the signer up to the moment (Generated, sid, $v$) 
is output. See the discussion after the proof for generalization.) $Z^*$ then sends 
(KeyGen, sid) and $v$ to $A^*$ and receives (Generated, sid, $v$) from $A^*$ working on 
behalf of the corrupt signer. $Z^*$ then asks for a signature on a message $m$ by sending 
(Request, sid, ssid, $v$, $m$) to an honest user. If $A^*$ is to join $\pi$ on behalf of the 
signer to generate a signature, $Z^*$ takes over the role and completes the protocol 
by faithfully following $\pi$. The user eventually outputs (Received, sid, ssid, $\sigma$). 
Finally $Z^*$ sends (Verify, sid, ssid, $v$, $m$, $\sigma$) to a user and receives (Verified, sid, 
ssid, $f$) as a result of verification. Observe that, even though the signer is 
corrupted, $Z^*$ simulates an honest signer by following $\pi$. Furthermore, due to the 
completeness and terminating property of $\pi$, $Z^*$ can complete signature 
generation with noticeable probability. If $Z^*$ completes, $f = 1$ appears at the end. 
Since $\pi$ realizes $\mathcal{F}_{ncb}$, there exists a simulator $S^*$ for such $A^*$ and $Z^*$. To 
successfully simulate $A^*$, simulator $S^*$ has to send $\Pi$ to $\mathcal{F}_{ncb}$ before $Z^*$ sends 
(Request, sid, ssid, $v$, $m$) to an honest user. Furthermore, with noticeable 
probability, $\Pi(m)$ must yield a valid signature accepted by protocol $\pi$. 

Now we construct $Z$ that distinguishes $\mathrm{EXEC}_{\pi, A, Z}$ and $\mathrm{IDEAL}_{\mathcal{F}_{ncb}, S, Z}$ by 
using the above $S^*$ as a subroutine. $Z$ first sends (KeyGen, sid) to the honest signer 
and receives (Generated, sid, $v$). Then $Z$ starts simulating $Z^*$. It asks $S^*$ to 
corrupt the simulated signer. Then it sends (KeyGen, sid) and $v$ to $S^*$ and receives 
(Generated, sid, $v$, $\Pi$) from $S^*$ on behalf of $\mathcal{F}_{ncb}$. Now $Z$ computes $\sigma \leftarrow \Pi(m)$ 
for some $m$. It then sends (Verify, sid, ssid, $v$, $m$, $\sigma$) to a verifier and receives 
(Verified, sid, ssid, $f$). The output of $Z$ is $f$. 

Let us evaluate $Z$. Suppose that $Z$ is in $\mathrm{EXEC}_{\pi, A, Z}$. $Z$ simulates $Z^*$ 
perfectly for $S^*$. In particular $v$ in this case is generated honestly by $\pi$ just as 
$Z^*$ does. So $S^*$ outputs (Generated, sid, $v$, $\Pi$) as expected. Then with 
noticeable probability such $\Pi$ yields $\sigma$ that passes the verification protocol of $\pi$. Thus 
$f = 1$ happens with noticeable probability in this case. Next suppose that $Z$ is 
in $\mathrm{IDEAL}_{\mathcal{F}_{ncb}, S, Z}$. In this case, $v$ is generated by $S$. If it is distinguishable from 
the one observed in $\mathrm{EXEC}_{\pi, A, Z}$, $Z$ distinguishes $\mathrm{EXEC}_{\pi, A, Z}$ and $\mathrm{IDEAL}_{\mathcal{F}_{ncb}, S, Z}$ 
on that basis. If it is indistinguishable, $S^*$ outputs (Generated, sid, $v$, $\Pi$) just 
as in the previous case. Since no signature generation process is completed in 
$\mathrm{IDEAL}_{\mathcal{F}_{ncb}, S, Z}$ and $\mathcal{F}_{ncb}$ provides absolute unforgeability, $\mathcal{F}_{ncb}$ rejects $\sigma$ 
generated by $\Pi$. Thus $f = 0$ for this case. Accordingly $Z$ distinguishes $\mathrm{EXEC}_{\pi, A, Z}$ 
and $\mathrm{IDEAL}_{\mathcal{F}_{ncb}, S, Z}$ with noticeable probability. □ 

An essential point is that $\mathcal{F}_{ncb}$ requires $S$ to extract $\Pi$ even from a corrupt 
signer for the sake of blindness. But the successful extraction of $\Pi$ contradicts 
unforgeability. The situation is very similar to the case of UC 
commitments [3 where the message from a corrupt committer must be extracted for the 
sake of the binding property, and the successful extraction contradicts the hiding 
property. 

The proof does not go through if protocol $\pi$ involves incorruptible trusted 
parties or any extra ideal functionalities. The point is that $Z^*$ should be able to 
run $\pi_{kg}$ by itself so that the distribution of $v$ is solely under its control. This 
allows $Z$ to simulate $Z^*$ simply by sending $v$ generated outside of $Z$. If $\pi_{kg}$ 
involves parties other than the signer, $Z^*$ corrupts them before they send off 
any message and simulates them honestly by following $\pi_{kg}$. When $Z$ simulates 
$Z^*$, these corrupted parties are simulated by following the behavior of the real 
uncorrupted players $Z$ is working with. 


5 Characterization 


5.1 Blindness Based on Simulatability 

The following new notion called simulation blindness assures that the signature 
generation protocol can be executed without knowing the message. Similarly, 
the resulting signature can be generated without involving any information from 
the protocol run. To capture adaptive security, we require a state reconstruction
property. We use the term equivocal when a notion involves the state reconstruction
property.

Definition 3 (Equivocal Simulation Blindness: EqSimBLND). A blind
signature scheme BS is equivocal simulation blind if there exists a set of
algorithms SIM = SIM.{Crs, User, Sig, State} such that SIM.User and SIM.State
can be stateful and SIM.Sig must be stateless, and the advantage
Adv^{eqsimblnd}_{BS,D*}(λ) = |Pr[EqSimBL_{BS,D*}(λ, 0) = 1] − Pr[EqSimBL_{BS,D*}(λ, 1) = 1]|
is negligible in λ for any D*, where EqSimBL_{BS,D*}(λ, b) is the following experiment.
Oracles are accessible in an arbitrary manner.

EqSimBL_{BS,D*}(λ, 1):
  Σ ← BS.Crs(1^λ)
  vk ← D*(Σ)
  b' ← D*^{O_1(Σ, vk, ·)}
  Return b'

O_1(Σ, vk, m):
  σ ← ⟨BS.User(Σ, vk, m; r), D*⟩_L
  Output (σ, r)

EqSimBL_{BS,D*}(λ, 0):
  (Σ, t) ← SIM.Crs(1^λ)
  vk ← D*(Σ)
  b' ← D*^{O_0(Σ, vk, ·, t)}
  Return b'

O_0(Σ, vk, m, t):
  δ[ω_u] ← ⟨SIM.User(Σ, vk, t), D*⟩_L
  σ[ω_s] ← SIM.Sig(Σ, vk, m, t)
  r ← SIM.State(ω_u, ω_s)
  If δ = 0, then set σ = ⊥.
  Output (σ, r)

Here ω_u and ω_s denote the state information of SIM.User and SIM.Sig,
respectively.


Note that SIM.State is supposed to simulate the randomness even for the case
where the interaction between SIM.User and D* is terminated abnormally.
SIM.State can see how the interaction is terminated by seeing the state
information ω_u.

It would be more useful if we could present separate notions of simulatability
for simulating the view of sessions by SIM.User and the signatures by SIM.Sig.
We call these notions session equivocality and signature equivocality. Separating
them is, however, not appropriate in general, since SIM.User and SIM.Sig use the
same trapdoor as input and may interfere with each other when they are used at
the same time. We thus consider a special case where the trapdoor is separated
as (t_1, t_2), and SIM.Sig (and SIM.User) can be run only with t_1 (and t_2,
respectively). With respect to the separable trapdoor generator we define two notions
of simulatability.

Definition 4 (Separable Trapdoor Generator). SIM.Crs is a separable trapdoor
generator if it outputs (Σ, (t_1, t_2)) such that Σ is indistinguishable from
those generated by BS.Crs with negligible advantage, say Adv_{C*}, for any
algorithm C*.


Definition 5 (Signature Equivocality: SigEq). A blind signature scheme BS
is signature equivocal if there exist algorithms SIM.Sig and SIM.SigState such
that the advantage function Adv^{sigeq}_{BS,A*}(λ) = |Pr[SigEQ_{BS,A*}(λ, 0) = 1] −
Pr[SigEQ_{BS,A*}(λ, 1) = 1]| is negligible in the security parameter λ for any A*,
where SigEQ_{BS,A*}(λ, b) is the following experiment.

SigEQ_{BS,A*}(λ, b):
  (Σ, (t_1, t_2)) ← SIM.Crs(1^λ)
  vk ← A*(Σ)
  b' ← A*^{O_b(Σ, vk, ·, t_1)}
  Return b'

O_1(Σ, vk, m, t_1):
  σ ← ⟨BS.User(Σ, vk, m; r_1‖r_2), A*⟩_L
  Output (σ, r_1‖r_2)

O_0(Σ, vk, m, t_1):
  σ[θ] ← ⟨BS.User(Σ, vk, m; r_1‖r_2), A*⟩_L
  σ'[ω_s] ← SIM.Sig(Σ, vk, m, t_1)
  r'_1 ← SIM.SigState(θ, ω_s)
  If σ = ⊥, then σ' = ⊥ and r'_1 = r_1.
  Output (σ', r'_1‖r_2)

Symbol θ is the transcript observed by BS.User, and ω_s is the state information of
SIM.Sig.

Definition 6 (Session Equivocality: SesEq). A blind signature scheme BS
is session equivocal if there exist algorithms SIM.User and SIM.SesState such
that the advantage function Adv^{seseq}_{BS,E*}(λ) = |Pr[SesEQ_{BS,E*}(λ, 0) = 1] −
Pr[SesEQ_{BS,E*}(λ, 1) = 1]| is negligible in λ for any algorithm E*, where experiment
SesEQ_{BS,E*}(λ, b) is the following.

SesEQ_{BS,E*}(λ, b):
  (Σ, (t_1, t_2)) ← SIM.Crs(1^λ)
  vk ← E*(Σ, t_1)
  b' ← E*^{O_b(Σ, vk, ·, t_2)}
  Return b'

O_1(Σ, vk, m, t_2):
  ⟨BS.User(Σ, vk, m; r_1‖r_2), E*⟩
  Return r_2

O_0(Σ, vk, m, t_2):
  δ[ω_u] ← ⟨SIM.User(Σ, vk, t_2), E*⟩_L
  r_2 ← SIM.SesState(ω_u, m)
  Return r_2

Oracle O_b receives a message m from E* and interacts with E*. Symbol ω_u is
the state information of SIM.User.

In Definition 5 it is assumed that the randomness r used in BS.User can be separated
into two parts r_1 and r_2. The intuition is that r_2 is used while interacting with the
signer and r_1 is used after receiving the final message from the signer for computing
the output signature. This treatment does not lose generality as one can set either
part as empty. Regarding Definition 6 we stress that the messages and the resulting
signatures are not given to E*. Also note that trapdoor t_1 is given to E*.

We now show relations between the standard blindness and simulation blindness.
Since simulation blindness captures blindness in a very strong way, it seems
natural that the following lemma holds. Proofs for the following lemmas are in [1].

Lemma 1 (EqSimBLND ⇒ BL). If BS is equivocal simulation blind then it is
blind.

The proof is done in a standard way. We construct D* that successfully breaks
equivocal simulation blindness by using B* that breaks blindness.

Regarding the reverse direction, we do not know whether blindness alone implies
simulation blindness. We can, however, show that there exists a scheme
that is blind and unforgeable but not simulation blind. Namely, for the schemes
that provide both blindness and unforgeability, simulation blindness is a
strictly stronger notion than blindness. This implication is limited but sufficiently
meaningful since we are interested in schemes that provide both blindness and
unforgeability. The proof can be done in a similar way to that of Theorem 1.

Lemma 2 (BL ∧ UF ⇏ EqSimBLND). There exists BS that is blind and
unforgeable but not equivocal simulation blind.

The following lemma states that it suffices to consider simulatability of sessions
and signatures individually when trapdoors are separable for each purpose.

Lemma 3 (SesEq ∧ SigEq ⇒ EqSimBLND). If BS has a separable trapdoor
generator and is signature equivocal and session equivocal with respect to the
generator then BS is equivocal simulation blind.

The proof proceeds through a sequence of games, starting from
EqSimBL_{BS,D*}(λ, 1) and ending at EqSimBL_{BS,D*}(λ, 0).

5.2 Protocol Wrapper Wrap()

In Fig. 2 we show how to transform a blind signature scheme BS into a blind
signature protocol by applying a simple wrapper algorithm, Wrap(). The resulting
protocol Wrap(BS) is in the F_crs-hybrid model where F_crs is the CRS generation
and distribution functionality whose output distribution is defined by BS.


Blind Signature Protocol Wrap(BS) in F_crs-model

Key Generation: Upon receiving (KeyGen, sid) from the environment Z, a
party P_s verifies that sid = (P_s, sid') for some sid'. If not, do nothing. Else,
P_s derives CRS Σ from F_crs, computes (vk, sk) ← BS.Key(Σ) and outputs
(Generated, sid, v) where v(m, σ) = BS.Vrf(Σ, vk, σ, m).

Blind Signature Generation: Party P_u and P_s do the following.

P_u-side: On receiving (Request, sid, ssid, v', m) from Z, derive Σ from F_crs,
send (Request, sid, ssid, v') to P_s, invoke BS.User(Σ, vk', m), and interact
with P_s. Take vk' out from v'. If BS.User outputs σ such that
BS.Vrf(Σ, vk', σ, m) = 1, then output (Received, sid, ssid, σ).

P_s-side: On receiving (Request, sid, ssid, v') from a user P_u, get Σ from
F_crs, invoke BS.Signer(Σ, sk) and interact with P_u. If BS.Signer outputs
completed, then output (Signed, sid, ssid).

Signature Verification: On receiving (Verify, sid, ssid, v', m, σ) from Z, a party
P_v derives Σ from F_crs, takes vk' from v', computes f ← BS.Vrf(Σ, vk', σ, m),
and outputs (Verified, sid, ssid, f).

Common Reference Functionality F_crs
CRS Generation: On receiving (CrsGen, sid), F_crs computes Σ ← BS.Crs(1^λ) for
the first time and returns Σ. Simply return the same Σ for further requests.


Fig. 2. UC blind signature protocol transformed from stand-alone scheme BS

Note that the resulting protocol does not implement any mechanism to verify
the given verification algorithm v'. It works as intended if v' = v but no security
is guaranteed for the user if v' ≠ v. Also note that the signer ignores v' given
by the user and uses the genuine secret key sk.

5.3 Equivalence 

Theorem 2 (UF ∧ EqSimBLND ⇔ F_ncb). Protocol Wrap(BS) securely realizes
F_ncb with respect to adaptive adversaries if and only if BS is unforgeable and
equivocal simulation blind.


“If” direction is proven by constructing a simulator, S, that uses A as a black-box.
To run A properly, S simulates entities and their communication in
EXEC_{Wrap(BS),A,Z}. We then apply the game transformation technique starting from
IDEAL_{F_ncb,S,Z}(λ, a) as Game 0. Game 1 removes the use of simulation algorithms
SIM.Crs, SIM.User, SIM.Sig, and SIM.State from F_ncb and S. The difference
is negligible due to the simulation blindness. Game 2 then modifies the
verification process of F_ncb so that it no longer cares about the counters. This
modification is justified by the unforgeability. Game 3 further modifies the verification
process so that it completely follows the verification function. Justification is due
to the completeness and consistency. Game 4 then modifies F_ncb so that it does
not record the signed messages any more. It is justified by the completeness and
consistency again. Finally, Game 5 removes unused actions in F_ncb and S. This
is just cosmetic, to make sure that F_ncb and S do nothing but execute the real
protocol. Thus Game 5 is equivalent to EXEC_{Wrap(BS),A,Z}(λ, a).

“Only if” direction is more intricate. First, assuming that BS is not simulation
blind, we show that, for any S, there exists a successful Z. Second, assuming
that BS is simulation blind but forgeable, we construct a successful Z that is not
fooled by any S. For the first part, we construct simulation algorithms SIM.Crs,
SIM.User, SIM.Sig and SIM.State by using S as a subroutine. For such simulation
algorithms there exists an adversary D* that breaks simulation blindness since we
assumed that BS is not simulation blind. Then we use such D* to construct
Z. A tricky issue in constructing these simulation algorithms is that they do
not share the internal state. Since an individual copy of S is run independently in
each of these functions, the copies would output different CRSs and public keys. Our idea is
to use the trapdoor as a container of the randomness given to S so that every 
simulation algorithm can give the same randomness to S. In this way, every copy 
of S works on the same CRS and public-key so that all simulation algorithms 
work consistently. A formal proof is given in . 

6 A Generic Construction 

6.1 Overview 

Our starting point is the “basic” blind signature scheme by Fischlin [H]. In
his scheme, a user commits to message m by sending a commitment c and the
signer returns a bare signature s on c. Then the user computes a final signature
σ which actually is a non-interactive zero-knowledge proof of knowledge about
the message m and the valid signature s. Unforgeability is based on the binding
property of the commitment, the unforgeability of the bare signature scheme
and the knowledge soundness of NIZK. Blindness is from the hiding property
of the commitment scheme and the zero-knowledge property of NIZK. By BS_G
we denote this generic scheme. When transformed by our wrapper, Wrap(BS_G)
securely realizes non-committing blind signature functionality F_ncb with respect
to static adversaries. (See 0 for details.) It is a surprise that such a conceptually 
simple scheme can provide universal composability even though the adversary is 
limited to be static. 

An essential issue to handle adaptive security is the state reconstruction.
Looking at the structure of BS_G, the session equivocality can be easily achieved
by replacing the commitment scheme with a trapdoor commitment scheme. (In
fact, with such a small modification to BS_G, the resulting Wrap(BS_G) provides
adaptive UC security in the erasure model.) On the other hand, the signature
equivocality is not generally possible there. Recall that a signature is simulated
by the zero-knowledge simulator. It therefore can be the case that there exists
no randomness that is consistent with a real witness. To overcome this problem, we
consider eliminating the use of the zero-knowledge simulator by providing a correct
witness to the proof system through the simulation of the bare signature on the
signer side. Namely, we make the signer’s signing algorithm simulatable
by using a signature scheme in the CRS model so that valid signatures can
be created with the trapdoor of the CRS. In this way, we can always provide
a witness to the proof system used in the user-side algorithm. Now, witness
indistinguishability of the proof system assures that the same proof could have
been created from any other witness. Accordingly, a consistent randomness
always exists. This particular structure is suggested in [H] for the purpose of
removing the CRS in the stand-alone model. We will take advantage of the
structure for achieving adaptive security.

6.2 Building Blocks 

— NIWI (Non-interactive Witness Indistinguishable Proof System). It is a
non-interactive witness indistinguishable proof system of knowledge when the CRS
is generated in the regular way. By NIWI.Crs, NIWI.Prf and NIWI.Vrf, we denote
the CRS generation function, the proof generation function and the verification
function, respectively. Additionally it must allow state reconstruction when the
CRS is simulated. Namely, one can reconstruct a consistent randomness for a
given witness and a valid transcript. The Groth-Sahai proof system [11], the
GS proof system for short, meets these requirements under the SXDH or DLIN
assumption. It unfortunately does not work for any NP statement but works
efficiently for relations represented by bilinear products. We thus need to choose
other building blocks so that they fit the GS proof system for instantiation.

— TC ( Trapdoor Commitment Scheme). It is a standard trapdoor commitment 
scheme. By TC.Key, TC.Com and TC.Vrf, we denote the key generation function, 
the commitment function, and the verification function. There are two more 
functions such that one generates a random commitment and the other opens the 
commitment to an arbitrary value by using the trapdoor generated by TC.Key. 
See P an instantiation that works well with GS proof system under the SXDH 
assumption. 

— SSIG (Simulatable Signature Scheme). It is a signature scheme in the CRS
model with a special property such that valid signatures can be computed from
the public-key and the trapdoor behind the CRS. By SSIG.Crs and SSIG.Key, we
denote the CRS generation function and the key generation function. SSIG.Key
takes the CRS and outputs a signing key and a verification key. Besides the
signature generation function SSIG.Sign, there is a signature simulation function
SSIG.Sim that generates valid signatures by using the public-key and the
trapdoor generated by SSIG.Crs. It is stressed that the simulated signatures must
pass the verification by the verification function SSIG.Vrf but it is not demanded
that they are indistinguishable from the real ones. Similarly, unforgeability is
the standard unforgeability against chosen message attacks. In particular, the
adversary is not given simulated signatures.

Any standard signature scheme can be turned into a simulatable one in an 
unconditional way as follows. Generate two key pairs by running the key gener- 
ation algorithm twice independently. The first key pair is used as the CRS and 
the trapdoor while the second pair is used as the verification and signing key. 
Normal signing is done by using the second key. Simulation is done by the first 
key. A signature is accepted if it passes the original verification predicate with 
respect to either of the keys. 
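
The two-key-pair transformation described above can be sketched in a few lines. The following Python sketch is only an illustration under stated assumptions: the base scheme is a toy Lamport one-time signature built from SHA-256 (our choice, not the paper's, and each key may sign only one message), and all function names (`base_keygen`, `ssig_crs`, `ssig_sim`, and so on) are hypothetical names mirroring SSIG.Crs, SSIG.Key, SSIG.Sign, SSIG.Sim and SSIG.Vrf.

```python
import hashlib
import secrets

BITS = 256  # SHA-256 digest length in bits


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(msg: bytes):
    """Bits of SHA-256(msg), most significant bit first."""
    d = _h(msg)
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]


# Toy base scheme: Lamport one-time signature (illustration only).
def base_keygen():
    sk = [[secrets.token_bytes(32), secrets.token_bytes(32)] for _ in range(BITS)]
    pk = [[_h(a), _h(b)] for a, b in sk]
    return pk, sk


def base_sign(sk, msg: bytes):
    return [sk[i][b] for i, b in enumerate(_bits(msg))]


def base_verify(pk, msg: bytes, sig) -> bool:
    return len(sig) == BITS and all(
        _h(sig[i]) == pk[i][b] for i, b in enumerate(_bits(msg)))


# Wrapper: a simulatable signature scheme from two independent key pairs.
def ssig_crs():
    crs, trapdoor = base_keygen()      # first key pair: CRS and its trapdoor
    return crs, trapdoor


def ssig_key(crs):
    vk, sk = base_keygen()             # second key pair: verification and signing key
    return vk, sk


def ssig_sign(crs, sk, msg: bytes):
    return base_sign(sk, msg)          # normal signing uses the second key


def ssig_sim(crs, vk, trapdoor, msg: bytes):
    return base_sign(trapdoor, msg)    # simulation uses the first key (the trapdoor)


def ssig_vrf(crs, vk, msg: bytes, sig) -> bool:
    # Accept if the signature verifies under either of the two public keys.
    return base_verify(vk, msg, sig) or base_verify(crs, msg, sig)
```

Note that a simulated signature verifies under the CRS key rather than under vk, so it is valid but not indistinguishable from a real one, which matches the requirement stated above.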

To fit the other building blocks, SSIG must be able to sign group elements
and the verification predicate must be represented as a product of pairings. For
such a signature scheme a feasibility result based on the DLIN assumption can be
seen in [15].


6.3 The Scheme 

The CRS generation function BS.Crs computes (Σ_wi, t_wi) ← NIWI.Crs(1^λ),
(Σ_tc, t_tc) ← TC.Key(1^λ), and (Σ_ssig, t_ssig) ← SSIG.Crs(1^λ), and outputs Σ =
(Σ_wi, Σ_tc, Σ_ssig). Key generation function BS.Key is the same as SSIG.Key, which
outputs vk and sk. The signature generation protocol is illustrated in Fig. 3. The
proof system NIWI proves the following relation between witness w = (s, c, z)
and instance x = (vk, Σ_tc, Σ_ssig, m):

TC.Vrf(Σ_tc, c, m, z) = 1 ∧ SSIG.Vrf(Σ_ssig, vk, c, s) = 1

Verification function BS.Vrf takes ((Σ_wi, Σ_tc, Σ_ssig), vk, σ, m) as input and
outputs p ∈ {0, 1} such that p ← NIWI.Vrf(Σ_wi, (vk, Σ_tc, Σ_ssig, m), σ).


Signer P_s: BS.Signer(Σ, sk)      Σ = (Σ_wi, Σ_tc, Σ_ssig)      User P_u: BS.User(Σ, vk, m)

User:    (c, z) ← TC.Com(Σ_tc, m); send c to the signer.

Signer:  s ← SSIG.Sign(Σ_ssig, sk, c); send s to the user.
         Output completed.

User:    If SSIG.Vrf(Σ_ssig, vk, s, c) ≠ 1 output ⊥.
         σ ← NIWI.Prf(Σ_wi, x, w) where
         x = (vk, Σ_tc, Σ_ssig, m) and
         w = (s, c, z).
         Output σ.

Fig. 3. Generic blind signature scheme BSs. The signature generation protocol.

Theorem 3. Protocol Wrap(BSs) securely realizes F_ncb in the F_crs-hybrid model
with respect to adaptive adversaries without erasures.

We claim that the scheme is session equivocal and signature equivocal. Observe
that setting t_1 = (t_wi, t_ssig) and t_2 = (t_tc) forms separated trapdoors. Session
equivocality is proven by constructing SIM.User and SIM.SesState by using the
trapdoor property of TC. Signature equivocality can be shown by constructing
SIM.Sig and SIM.SigState by using the simulation property of SSIG and state
reconstructability of NIWI. Thus from Lemma 3 we can say that the scheme
is equivocal simulation blind. We then argue that the scheme is unforgeable
due to the binding property of TC, the unforgeability of SSIG and the proof of
knowledge property of NIWI. Finally Theorem 2 is applied to complete the proof
of Theorem 3.

Abstract. Following the cryptanalyses of the encryption scheme HFE
and of the signature scheme SFLASH, no serious alternative multivariate
cryptosystems remained, except maybe the signature schemes UOV and
HFE-. Recently, two proposals have been made to build highly efficient
multivariate cryptosystems around a quadratic internal transformation:
the first one is a signature scheme called square-vinegar and the second
one is an encryption scheme called square introduced at CT-RSA 2009.

In this paper, we present a total break of both the square-vinegar
signature scheme and the square encryption scheme. For the practical
parameters proposed by the authors of these cryptosystems, the complexity
of our attacks is about 2^35 operations. All the steps of the attack
have been implemented in the Magma computer algebra system, which
allowed us to experimentally verify the results presented in this paper.


1 Introduction 

There are mainly two motivations behind the construction of multivariate cryp- 
tosystems. The original one is to provide alternatives to the asymmetric schemes 
RSA and those based on Discrete Logarithm problems which are connected to 
number theoretic problems. Multivariate cryptosystems are instead connected 
to the hardness of solving randomly chosen systems of multivariate equations 
over a finite field, a problem which is NP-complete even in the case of quadratic 
polynomials defined over GF(2) when there are at least two such polynomials 
in the system. Moreover, this problem seems to be hard not only for very spe- 
cial instances but also on the average. Another incentive to develop multivariate 
cryptosystems is the expected efficiency that they might offer, a property that 
would be highly appreciated for constrained environments such as RFIDs and 
other embedded devices. Finally, some argue that, contrary
to the problems of factorisation and of computing discrete logarithms [23], no
quantum algorithm is known for the problem of solving sets of randomly chosen
multivariate equations.

After the introduction of the C* cryptosystem by Matsumoto and Imai in 
fl Mil til . there have been several other proposals. Among the most famous ones 
are certainly HFE (Hidden Field Equations) and SFLASH which can be thought 
of as two ways of generalising the C* scheme. Some heuristic design principles 
have followed. A major one, which has been originally suggested by Shamir 
in EH, is to remove some equations from the public mapping in the case of
signature schemes; this principle has proven to be successful in thwarting Patarin’s
attack m against C* (an attack that can be viewed as a preliminary to Gröbner
basis attacks). Another one consists in adding a new set of variables to perturb
the analysis as in the UOV (Unbalanced Oil and Vinegar) signature scheme |H|.

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 451-468, 2009.

Two of the most promising proposals, SFLASH and HFE, have been cryptanalysed
in recent years. Some HFE instances have been shown to succumb to
Gröbner basis attacks in 0 and the complexity of such attacks has been argued
to be quasi-polynomial. SFLASH has been entirely broken: the missing
equations (due to the minus transformation) can be recovered in most cases as
explained in jS| and the secret key of the resulting C* scheme can be recovered
following the cryptanalysis described in [HU. In this context, two new proposals
were based on internal transformations that are not only quadratic on the base 
field, but also on the extension field: a signature scheme called square-vinegar 
was proposed in j2| and an encryption scheme called square appears in ^j. 

Our Results. In this paper, we expose a total break of both the square-vinegar
signature and the square encryption proposals from a theoretical point of view
as well as from a practical point of view. We indeed describe how to recover an
equivalent secret key for both cryptosystems given the public key alone. For the
parameters recommended by the authors, our attacks complete in a few minutes
on a standard PC. These cryptanalyses also represent a theoretical break of the
schemes as, under some reasonable assumptions, their complexity is shown to
be polynomial with respect to the security parameter: the attacks have a time
complexity of O(log^2(q) n^6) since they rely on standard linear algebra on n^2
unknowns over a finite field of size q and n is typically small because the time
complexity of the public computation (signature or encryption) is O(n^3). The
attacks are sequences of steps including the discovery of new algebraic invariants 
leaking from the public key, a careful analysis of these invariants to sort out vine- 
gar unknowns from the standard ones. We additionally implemented Magma 0 
programs that were used to verify each of the steps of the cryptanalyses and to 
perform the attacks against the different sets of parameters recommended by the 
designers of the square encryption and square- vinegar signature schemes. Their 
source code is given in the appendix. 
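
As a rough sanity check of the O(log^2(q) n^6) figure quoted above, the count can be written out as follows. This is only a sketch under assumptions of ours that the text does not spell out: the linear algebra is plain Gaussian elimination, and arithmetic in the base field is schoolbook, so that one field multiplication costs O(log^2 q) bit operations. The symbol N is introduced here for the number of unknowns.

```latex
\[
N = n^2 \ \text{unknowns}
\;\Longrightarrow\;
O(N^3) = O(n^6) \ \text{field operations}
\;\Longrightarrow\;
O\!\left(\log^2(q)\, n^6\right) \ \text{bit operations.}
\]
```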

2 The Square Cryptosystems 

The square cryptosystems are based on design ideas taken from both the HFE 
cryptosystem and the UOV cryptosystem. However, an important property of 
the square cryptosystems is that they are defined over fields of odd characteristic: 
as their internal transformations are quadratic, the systems would be linear over 
fields of characteristic 2. We begin by a brief reminder on HFE and UOV before 
proceeding to the description of the square cryptosystems themselves. 
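
The remark on characteristic 2 can be checked on small fields: squaring is additive in characteristic 2, since (a + b)^2 = a^2 + b^2 there, but not in odd characteristic. The Python sketch below is illustrative only; the choice of GF(2^3) with modulus x^3 + x + 1 and of GF(7) as test fields, and the function names, are ours.

```python
def gf8_mul(a: int, b: int) -> int:
    """Multiply in GF(2^3) = GF(2)[x]/(x^3 + x + 1); elements are 3-bit integers."""
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a            # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & 0b1000:        # degree reached 3: reduce modulo x^3 + x + 1
            a ^= 0b1011
    return r


def squaring_is_additive_gf8() -> bool:
    """Check (a + b)^2 == a^2 + b^2 for every pair of elements of GF(8)."""
    return all(gf8_mul(a ^ b, a ^ b) == gf8_mul(a, a) ^ gf8_mul(b, b)
               for a in range(8) for b in range(8))


def squaring_is_additive_mod(p: int) -> bool:
    """Same check in the prime field GF(p)."""
    return all(((a + b) ** 2 - a * a - b * b) % p == 0
               for a in range(p) for b in range(p))
```

In odd characteristic the cross term 2ab does not vanish, which is what makes X ↦ X^2 genuinely quadratic there.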

2.1 The HFE Cryptosystem 

The HFE cryptosystem has been proposed by Patarin in [THj as a possible
generalisation (and strengthening) of the C* scheme proposed by Matsumoto and
Imai in [E3. Indeed C* was broken by Patarin m whereas the best attacks
against HFE are Gröbner basis attacks, whose complexity was argued to be
quasi-polynomial m. HFE stands for hidden field equations because its internal
transformation is kept secret. This internal transformation F is defined over an
extension E of degree n over some base field F_q and is chosen to be F_q-quadratic:

\[
F : X \longmapsto \sum_{\substack{0 \le i \le j \\ q^i + q^j \le D}} \alpha_{i,j}\, X^{q^i + q^j}
 \;+\; \sum_{\substack{0 \le k \\ q^k \le D}} \beta_k\, X^{q^k} \;+\; \gamma , \tag{1}
\]

where the coefficients α_{i,j}, β_k, and γ lie in E and D is an upper bound to the
overall degree to make it practical to invert F through factorization. Since F is an
F_q-quadratic mapping, it can also be expressed over F_q as an n-tuple (f_1, ..., f_n)
of quadratic polynomial mappings in n unknowns and so can the composition
T ∘ F ∘ S for any pair of one-to-one affine mappings S : F_q^n → E and T : E → F_q^n.
In the case of HFE, the mappings S and T are kept secret and together with F,
constitute the secret key, whereas the public key is the mapping G = T ∘ F ∘ S.
In order to decrypt, the legitimate user applies the inverse of T, finds roots of
the univariate polynomial on the extension field E and applies the inverse of S
to each of these roots. The plaintext is one of the roots which can be singled
out by using some redundancy. In this decryption process, the knowledge of the
secrets S and T is crucial.

Additionally, Shamir’s proposal to remove some (say r) of the n polynomials
that constitute the public key can be applied in the case of a signature scheme:
indeed, to sign a message (y_1, ..., y_{n−r}), the signer first completes the message
with random values y_{n−r+1}, ..., y_n and “decrypts” it normally. This operation
is called the minus transformation and is used in the square-vinegar scheme.

With these notations, C* is similar to HFE (with an unbounded total degree)
where all coefficients of the internal transformation are set to zero but α_{0,θ} for
a well chosen θ. SFLASH in turn P, is the original C* scheme with the minus
transformation applied.

2.2 The UOV Signature Scheme 

Another ingredient in the design of the square-vinegar signature scheme is the
use of additional unknowns meant to make the analysis of the scheme harder by trying
to break the structure used during the decryption process. Such an idea was first
proposed in the oil and vinegar signature scheme. This scheme uses two sets of
unknowns (x_1, ..., x_n) and (z_1, ..., z_v) respectively called the oil and the vinegar
variables. The internal transformation then consists of an n-tuple of polynomials
F = (f_1, ..., f_n) of the special form:

\[
f_k(x, z) = \sum_{i,j} \alpha^{k}_{i,j}\, x_i z_j + \sum_{i,j} \beta^{k}_{i,j}\, z_i z_j
 + \sum_{i} \gamma^{k}_{i}\, x_i + \sum_{i} \delta^{k}_{i}\, z_i + \varepsilon^{k} , \tag{2}
\]

where the coefficients α^k_{i,j}, β^k_{i,j}, γ^k_i, δ^k_i, and ε^k are randomly chosen from
the base field F_q. The x_i
are called oil variables because they do not mix, i.e. there is no cross-term x_i x_j.
Vinegar variables z_i, in contrast, mix with other vinegar variables as well as
with oil variables. The fact that the coefficients of the polynomials are chosen
randomly is satisfactory since the resulting polynomials look closer to randomly
chosen ones. However, the two types of variables make it possible to create a
signature scheme: in order to find some pre-image of y = (y_1, ..., y_n) through F
the signer first draws some random values for z_1, ..., z_v and substitutes them in
the description of F. The resulting set of polynomials becomes linear in the oil
unknowns x_i and the associated n × n linear system (with y as right member) is
easily solved: a constant fraction of the time, the system has a solution (a_1, ..., a_n) which
makes (a, z) a pre-image of y through F and otherwise another choice for z is
made until there is a solution. Obviously, this structure has to be hidden from
the view of an attacker and the public key is the composition G = F ∘ S where
S : F_q^{n+v} → F_q^{n+v} is a one-to-one affine application.
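
The pre-image search just described (fix the vinegar values, then solve a linear system in the oil unknowns) can be sketched as follows. This Python sketch makes several assumptions of ours: the base field is a small prime field GF(p); each polynomial has the shape Σ A[i][j] x_i z_j + Σ B[i][j] z_i z_j + Σ C[i] x_i + Σ D[i] z_i + e, with no oil-times-oil term; singular systems are simply retried; and the secret map S is left out. All names (`random_poly`, `invert_F`, and so on) are hypothetical.

```python
import random


def random_poly(n, v, p, rng):
    """One oil-and-vinegar polynomial (A, B, C, D, e) with random coefficients."""
    r = lambda: rng.randrange(p)
    return ([[r() for _ in range(v)] for _ in range(n)],   # A: x_i * z_j
            [[r() for _ in range(v)] for _ in range(v)],   # B: z_i * z_j
            [r() for _ in range(n)],                       # C: x_i
            [r() for _ in range(v)],                       # D: z_i
            r())                                           # e: constant


def evaluate(poly, x, z, p):
    A, B, C, D, e = poly
    n, v = len(x), len(z)
    val = e
    val += sum(A[i][j] * x[i] * z[j] for i in range(n) for j in range(v))
    val += sum(B[i][j] * z[i] * z[j] for i in range(v) for j in range(v))
    val += sum(C[i] * x[i] for i in range(n))
    val += sum(D[i] * z[i] for i in range(v))
    return val % p


def solve_mod_p(M, rhs, p):
    """Gauss-Jordan elimination over GF(p); unique solution, or None if singular."""
    n = len(M)
    aug = [row[:] + [b] for row, b in zip(M, rhs)]
    for col in range(n):
        piv = next((r for r in range(col, n) if aug[r][col] % p), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, p)        # modular inverse (Python 3.8+)
        aug[col] = [(a * inv) % p for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % p for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def invert_F(polys, y, v, p, rng, max_tries=100):
    """Find (x, z) with F(x, z) = y by guessing the vinegar values z."""
    n = len(polys)
    for _ in range(max_tries):
        z = [rng.randrange(p) for _ in range(v)]
        M, rhs = [], []
        for (A, B, C, D, e), yk in zip(polys, y):
            # With z fixed, the polynomial is affine in the oil unknowns x.
            M.append([(C[i] + sum(A[i][j] * z[j] for j in range(v))) % p
                      for i in range(n)])
            const = (e + sum(D[i] * z[i] for i in range(v))
                     + sum(B[i][j] * z[i] * z[j]
                           for i in range(v) for j in range(v)))
            rhs.append((yk - const) % p)
        x = solve_mod_p(M, rhs, p)
        if x is not None:
            return x, z
    return None
```

A full signer would additionally apply the inverse of the secret affine map S to (x, z); that step is omitted here.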

The message size over signature size for the UOV signature scheme is not 
optimal since the number of vinegar unknowns must be at least twice as big as the
number of oil unknowns for it to be secure I'Till fill 4| . 


2.3 The Square-Vinegar Signature Scheme 

The square-vinegar signature scheme strives to provide an efficient alternative
to UOV or HFE with the minus transformation applied. Let F_q be a finite field
and E be an extension of degree n over F_q. The internal transformation of the
square-vinegar scheme is defined as:

\[
F : E \times \mathbb{F}_q^{v} \to E, \quad (X, X_v) \longmapsto \alpha X^2 + \beta(X_v)\,X + \gamma(X_v) , \tag{3}
\]

where α is a constant randomly chosen from E, β : F_q^v → E is a randomly chosen
affine application, and γ : F_q^v → E is a randomly chosen F_q-quadratic application.
This internal transformation is hidden by two full rank affine applications
S : F_q^{n+v} → E × F_q^v and T : E → F_q^n. Therefore S mixes the vinegar unknowns X_v with
the “normal” unknowns X. In addition to T, a projection Π is applied where r
of the n components have been removed as in SFLASH or HFE-. The affine
transforms S and T together with the applications γ, β, and the constant α
constitute the secret key. The public key P results from the composition of the
applications: P = Π ∘ T ∘ F ∘ S.

The use of an odd characteristic base field is advertised by the authors as
a means to thwart Gröbner bases attacks since introducing the corresponding
field equations in the computation renders it impractical. Mixing the vinegar
unknowns with the normal ones breaks the algebraic relations between the input
and the output that appeared in C* (bilinear relations FT71) or HFE (algebraic
relations of higher degree, as explained in j7H2|). Finally, just as for HFE-,
removing part of the output information further mitigates Gröbner bases attacks
and prevents Kipnis and Shamir’s attack developed against UOV.

Signature. The signing process is highly efficient. It only requires the holder
of the secret key to randomly pick r elements from F_q to complete the message
(m_1, ..., m_{n−r}) to be signed into m = (m_1, ..., m_{n−r}, m_{n−r+1}, ..., m_n)
and to invert the public application in three steps: S^{-1} ∘ F^{-1} ∘ T^{-1}(m). Applying
T^{-1} and S^{-1} is a matter of multiplying with precomputed matrices, and
inverting F requires finding the roots of a quadratic univariate polynomial over E.
In case there is no solution, the signer restarts the process by choosing another
way of completing the message into m.
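
Inverting F over E comes down to the usual quadratic formula, which is available because the characteristic is odd. The worked equation below is a sketch: it assumes that the vinegar values X_v have already been fixed (for instance drawn at random, as in UOV), which the text above does not spell out, and that α ≠ 0. The names Y for T^{-1}(m) and Δ for the discriminant are ours.

```latex
\[
\alpha X^2 + \beta(X_v)\,X + \gamma(X_v) = Y
\quad\Longleftrightarrow\quad
X = \frac{-\beta(X_v) \pm \sqrt{\Delta}}{2\alpha},
\qquad
\Delta = \beta(X_v)^2 - 4\alpha\bigl(\gamma(X_v) - Y\bigr).
\]
```

A root exists exactly when Δ is a square in E, which holds for roughly half of the possible values of Δ; otherwise the signer has to retry.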

2.4 The Square Encryption Scheme 

A companion scheme to this square-vinegar signature scheme has been proposed
in £1]. The square encryption scheme strives to provide an efficient and secure
alternative to HFE and, like the square-vinegar scheme, has a square internal
transformation: F : E → E, X ↦ X^2. The parameters are chosen so that the
size of the base field satisfies q ≡ 3 mod 4 and the degree n of E over F_q is
odd. The transformation F is again hidden by two full rank affine mappings
S : F_q^{n−r} → E and T : E → F_q^n, which yields a public key P = T ∘ F ∘ S.
(Following |S|, the authors proposed to fix r of the input unknowns to a
pre-defined value (say, zero) to prevent the attacker from controlling the differential
of the public key as in Dubois, Fouque, Shamir, and Stern’s cryptanalysis jH|.)
This scheme is somewhat reminiscent of the C* scheme, where F(X) = X^{q^θ+1}
for a well chosen θ. But for the square encryption, where θ = 0, the bilinear
relation X Y^{q^θ} = X^{q^{2θ}} Y between X and Y = F(X) boils down to the tautology
XY = YX. The embedding S aims to finish hiding the algebraic structure of
the internal transformation.

Decryption. The holder of the secret key is able to decrypt very efficiently: in
addition to finding pre-images through $T$ and $S$, which amounts to solving simple
linear systems, the decryption process requires computing a square root in the
extension field $E$. Since $q^n \equiv 3 \bmod 4$, a square root of a square $Y$ is obtained as
$Y^{(q^n+1)/4}$ with the square-and-multiply algorithm. As there are two possible
square roots, the right one is singled out as the one lying in the image of $S$.
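
The square-root step can be illustrated in the simplest possible setting. The sketch below is only an illustration, not the scheme's implementation: it assumes a prime field with $p \equiv 3 \bmod 4$ in place of the extension field $E$, and the helper name `sqrt_mod` is hypothetical.

```python
def sqrt_mod(y: int, p: int):
    """Return a square root of y modulo a prime p with p % 4 == 3, or None."""
    if p % 4 != 3:
        raise ValueError("this shortcut needs p = 3 mod 4")
    y %= p
    # pow() performs modular square-and-multiply exponentiation.
    root = pow(y, (p + 1) // 4, p)
    # The candidate is a genuine root only when y is a square.
    return root if (root * root) % p == y else None


p = 31                      # 31 = 3 mod 4 (demo value)
root = sqrt_mod(5 * 5, p)   # 25 is a square by construction
```

When `sqrt_mod` returns a root, its negative is the second root; in the scheme, the decryptor keeps the one lying in the image of $S$.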

3 Cryptanalysis of the Square-Vinegar Signature Scheme 

We now describe a generic and very efficient attack against the square-vinegar
signature scheme. Our attack proceeds in three steps: we first exhibit an
invariant of the internal transformation and recover it through the analysis of the
differential of the public key; then, we use this information to recover an
equivalent representation of the vinegar space; in a third step, we transform the public
key into a special shape that allows us to invert it efficiently. Put together, these
three steps allow us to forge a signature for any given message.

3.1 Alternative Decompositions 

Recall that the internal transformation of the square-vinegar signature scheme
has the following structure:

  $F : E \times \mathbb{F}_q^v \to E$ ,  $(X, X_v) \mapsto \alpha X^2 + \beta(X_v)X + \gamma(X_v)$ ,

where $\alpha$ is a constant, $\beta : \mathbb{F}_q^v \to E$ is an affine $\mathbb{F}_q$-linear mapping, and
$\gamma : \mathbb{F}_q^v \to E$ is an $\mathbb{F}_q$-quadratic mapping, where $E$ is an extension of degree $n$
over $\mathbb{F}_q$. The public key is the mapping $P = \Pi \circ T \circ F \circ S$, where $\Pi$ is a
projection that removes $r$ polynomials, and $S : \mathbb{F}_q^{n+v} \to E \times \mathbb{F}_q^v$ and
$T : E \to \mathbb{F}_q^n$ are two affine mappings of full rank. The decomposition $(T, F, S)$
of the public key is kept secret.

A major component of the internal transformation $F$ is the mixing of vinegar
unknowns with $X$. It makes it harder for an attacker to use the specific structure
of a univariate quadratic polynomial of $F$ viewed as a function of $X$. A crucial
remark is that there exist linear mappings that, when composed with the internal
transformation, not only conserve its special form, but also discard the part of $F$
mixing the vinegar $X_v$ with $X$. Indeed, consider the mappings
$\sigma : (X, X_v) \mapsto (X - \frac{1}{2\alpha}\beta(X_v), X_v)$ and $\tau : Y \mapsto \frac{1}{\alpha}Y$. (Remember that the
scheme is defined over a field $\mathbb{F}_q$ of odd characteristic.) It can be checked that
these mappings provide an alternative decomposition $(T \circ \tau^{-1}, \tilde{F}, \sigma^{-1} \circ S)$ of the
public key, with $\tilde{F} = \tau \circ F \circ \sigma$, such that

  $\tilde{F} : (X, X_v) \mapsto X^2 + \tilde{\gamma}(X_v)$ ,   (4)

where $\tilde{\gamma}$ is an $\mathbb{F}_q$-quadratic mapping. We stress here that an attacker does not need
to know the mappings $\sigma$ and $\tau$ but rather assumes without loss of generality that
the public key follows the specific decomposition (4). (Also note that in a similar
fashion, keeping secret the defining polynomial of the extension has no effect:
as two fields of the same size are isomorphic and the isomorphism is a linear
bijective mapping, any arbitrary choice made by the attacker is "absorbed"
in $S$ and $T$.) This last decomposition can be further tweaked as in [?] to remove
the affine parts of the mappings $S$ and $T$, but at the expense of reintroducing a
linear term in $X$, leading to an internal transformation of the following shape:

  $F' : (X, X_v) \mapsto X^2 + \beta' X + \gamma'(X_v)$ ,   (5)

where $\beta'$ is a constant from $E$ and $\gamma'$ is some $\mathbb{F}_q$-quadratic mapping. In the
following sections, the attacker can therefore just assume wlog that the public
key is decomposed as $(T', F', S')$ where $S'$ and $T'$ are linear mappings, and $F'$
is as given in (5): then $(T', F', S')$ contains enough information to forge valid
signatures and thus constitutes an equivalent secret key. We call such a
decomposition a "split decomposition" (the unknowns $X$ and $X_v$ are now separated
in the internal transformation). A split decomposition is not unique: iterates of
the Frobenius mapping $\varphi : z \mapsto z^q$ and multiplications $\Lambda_u : z \mapsto uz$, $u \in E$,
do not alter the prescribed shape of the internal transformation (though
coefficients might change). In particular, if $(T_0, F_0, S_0)$ is a split decomposition, so
are $(T_0 \circ \Lambda_{u^{-2}},\ \Lambda_{u^2} \circ F_0 \circ \Lambda_{u^{-1}},\ \Lambda_u \circ S_0)$ and
$(T_0 \circ \varphi^{-i},\ \varphi^{i} \circ F_0 \circ \varphi^{-i},\ \varphi^{i} \circ S_0)$.
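
The removal of the mixed term is a completion of the square, which needs $2$ and $\alpha$ to be invertible (odd characteristic, $\alpha \neq 0$). The following worked computation, written with the symbols $\sigma$, $\tau$, $\alpha$, $\beta$, $\gamma$ of the text, is a sketch of that check; the name $\tilde{\gamma}$ for the resulting quadratic part is only a label.

```latex
\begin{align*}
F\bigl(X - \tfrac{1}{2\alpha}\beta(X_v),\, X_v\bigr)
  &= \alpha\Bigl(X - \tfrac{\beta(X_v)}{2\alpha}\Bigr)^{2}
     + \beta(X_v)\Bigl(X - \tfrac{\beta(X_v)}{2\alpha}\Bigr) + \gamma(X_v) \\
  &= \alpha X^{2} - \tfrac{\beta(X_v)^{2}}{4\alpha} + \gamma(X_v), \\
\tfrac{1}{\alpha}\, F\bigl(X - \tfrac{1}{2\alpha}\beta(X_v),\, X_v\bigr)
  &= X^{2} + \underbrace{\tfrac{1}{\alpha}\gamma(X_v)
       - \tfrac{\beta(X_v)^{2}}{4\alpha^{2}}}_{\tilde{\gamma}(X_v)} .
\end{align*}
```

Since $\beta$ is affine, $\beta(X_v)^2$ is quadratic in $X_v$, so $\tilde{\gamma}$ is again an $\mathbb{F}_q$-quadratic mapping of the vinegar unknowns alone.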

3.2 Using the Multiplicative Property of the Differential 

In the previous section we showed how to discard the cross-contribution of $X_v$
and $X$. However, the contribution $\gamma(X_v)$ still disturbs the algebraic properties
of the univariate quadratic in $X$. In order to circumvent this difficulty, we make
use of a tool first introduced by Fouque et al. in [?] that proved very useful
in attacking multivariate cryptosystems: the differential of the public mapping.
The differential of $P$ at $a$ is defined as: $\mathrm{D}P_a(x) = P(x + a) - P(x) - P(a) + P(0)$.

In the case of an $\mathbb{F}_q$-quadratic mapping, $\mathrm{D}P_a$ is such that $(x, a) \mapsto \mathrm{D}P_a(x)$
is a symmetric bilinear mapping. From now on, we denote by $\mathrm{D}P$ this bilinear
mapping and call it the differential of $P$. The differential corresponding to the
internal transformation $(X, X_v) \mapsto X^2 + \beta X + \gamma(X_v)$ of a square-vinegar instance is:

  $\mathrm{D}F((X, X_v), (Y, Y_v)) = 2XY + \mathrm{D}\gamma(X_v, Y_v)$ .   (6)
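
The symmetry and bilinearity of the differential can be checked numerically. The sketch below is an illustration under simplifying assumptions: it works with a single random quadratic polynomial over the prime field $\mathbb{F}_{31}$ rather than with an actual public key, and the names `P_MOD`, `make_quadratic`, and `differential` are hypothetical.

```python
import random

P_MOD = 31  # small odd prime, chosen only for the demonstration


def make_quadratic(n, rng):
    """Return a random quadratic polynomial in n variables over F_p."""
    quad = [[rng.randrange(P_MOD) for _ in range(n)] for _ in range(n)]
    lin = [rng.randrange(P_MOD) for _ in range(n)]
    const = rng.randrange(P_MOD)

    def f(x):
        q = sum(quad[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
        l = sum(lin[i] * x[i] for i in range(n))
        return (q + l + const) % P_MOD

    return f


def differential(f, x, a):
    """DP_a(x) = P(x + a) - P(x) - P(a) + P(0), computed modulo p."""
    n = len(x)
    shifted = [(x[i] + a[i]) % P_MOD for i in range(n)]
    return (f(shifted) - f(x) - f(a) + f([0] * n)) % P_MOD


rng = random.Random(0)
n = 4
f = make_quadratic(n, rng)
x, y, a = ([rng.randrange(P_MOD) for _ in range(n)] for _ in range(3))
c = rng.randrange(P_MOD)

# Symmetry: swapping the two arguments does not change the value.
symmetric = differential(f, x, a) == differential(f, a, x)
# Linearity in the first argument: D(c*x + y, a) = c*D(x, a) + D(y, a).
cx_plus_y = [(c * x[i] + y[i]) % P_MOD for i in range(n)]
linear = differential(f, cx_plus_y, a) == (
    c * differential(f, x, a) + differential(f, y, a)
) % P_MOD
```

Both checks succeed for any quadratic polynomial, because the constant and linear parts cancel in the definition and only the symmetrised quadratic part survives.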

The success of the attack lies in the fact that normal ($X$) and vinegar ($X_v$)
unknowns are separated in the expression of the differential $\mathrm{D}F$. More precisely,
the only linear mappings $L$ such that for all $(X, X_v)$ and all $(Y, Y_v)$:

  $\mathrm{D}F((L(X), X_v), (Y, Y_v)) - \mathrm{D}F((X, X_v), (L(Y), Y_v)) = 0 \iff L(X)Y = XL(Y)$

are $Z \mapsto \lambda Z$ for $\lambda \in E$. Indeed, any solution $L : Z \mapsto \sum_{0 \le i < n} l_i Z^{q^i}$ satisfies
$\sum_{0 \le i < n} l_i XY^{q^i} = \sum_{0 \le i < n} l_i X^{q^i}Y$ for all $X$ and $Y$ in $E$, and since the mappings
$(X, Y) \mapsto X^{q^i}Y^{q^j}$ form a basis of the space of bilinear forms, we must have
$l_i = 0$ for all $i > 0$.

In addition, we conjecture that with very high probability (with respect to
the uniform choice of the coefficients of $\gamma$) the only linear mappings $L$ verifying

  $\forall X_v\, \forall Y_v \quad \mathrm{D}\gamma(L(X_v), Y_v) - \mathrm{D}\gamma(X_v, L(Y_v)) = 0$

are $Z_v \mapsto cZ_v$ for some $c \in \mathbb{F}_q$. This might be heuristically justified by the fact
that the random choice of $\gamma$ does not allow such an algebraic property to appear,
and is verified experimentally. Assuming this conjecture is true, we have:

Proposition 1. For a random instance of the square-vinegar scheme, it happens
with very high probability that the only linear mappings $L$ verifying:

  $\forall (X, X_v)\, \forall (Y, Y_v) \quad \mathrm{D}F(L(X, X_v), (Y, Y_v)) - \mathrm{D}F((X, X_v), L(Y, Y_v)) = 0$   (7)

are $(Z, Z_v) \mapsto (\lambda Z, cZ_v)$, where $\lambda \in E$ and $c \in \mathbb{F}_q$.

Proof. Write $L : (Z, Z_v) \mapsto (A(Z) + C(Z_v),\ C'(Z) + B(Z_v))$ for some solution of (7).
Since the equation holds for all inputs of $\mathrm{D}F$, consider it specialised at $X_v = 0$
and $Y_v = 0$, with $\mathrm{D}F$ replaced by its expression (6):

  $\forall X\, \forall Y \quad [2A(X)Y + \mathrm{D}\gamma(C'(X), 0)] - [2XA(Y) + \mathrm{D}\gamma(0, C'(Y))] = 0$ .

As $\mathrm{D}\gamma(*, 0) = 0$ and $\mathrm{D}\gamma(0, *) = 0$ for any $*$, this gives $A(X)Y = XA(Y)$
which, as we saw above, implies $A : Z \mapsto \lambda Z$ for $\lambda \in E$. Similarly, at $X = 0$ and
$Y = 0$, (7) becomes: $\forall X_v\, \forall Y_v \ \ \mathrm{D}\gamma(B(X_v), Y_v) - \mathrm{D}\gamma(X_v, B(Y_v)) = 0$, implying
$B : Z \mapsto cZ$ for $c \in \mathbb{F}_q$ by conjecture. Finally, at $X = 0$ and $Y_v = 0$, (7) becomes:
$\forall X_v\, \forall Y \ \ \mathrm{D}\gamma(X_v, C'(Y)) = 2\,C(X_v)Y$. Assume for a contradiction that $C$ is not
identically null. Then setting $X_v = x_1$ such that $C(x_1) \neq 0$, the right-hand side
spans a vector space of dimension $n$ while the left-hand side spans a vector space
of dimension at most $v$. Hence, when $v < n$ as in a square-vinegar instance,
$C$ must be identically null. Then, for all $(X_v, Y)$, we have $\mathrm{D}\gamma(X_v, C'(Y)) = 0$
or equivalently $\gamma(X_v + C'(Y)) = \gamma(X_v) + \gamma(C'(Y)) - \gamma(0)$. In particular, this holds for
$X_v = C'(X)$ for any $X$ and any $Y$, so that $Z \mapsto \gamma(C'(Z))$ is affine, that is, $\gamma$ is
affine over $\mathrm{Im}(C')$. For a random $\gamma$ it is improbable that $\gamma$ is affine over some
(non-zero) subspace. Hence, with high probability, $C'$ is identically null. □

This property of $F$ naturally transfers to the public key, provided the removal
of polynomials does not completely destroy its algebraic structure:

Claim 1. If the number of coordinates removed by the projection $\Pi$ is less than
half and the coefficients of $\gamma$ are randomly chosen, the set of linear mappings $L$
satisfying

  $\forall X\, \forall Y \quad \mathrm{D}P(L(X), Y) - \mathrm{D}P(X, L(Y)) = 0$

is $\{S^{-1} \circ \Lambda_{u,c} \circ S\}_{u \in E,\, c \in \mathbb{F}_q}$, i.e. the conjugates by the secret mapping $S$ of all the
multiplications $\Lambda_{u,c} : (X, X_v) \mapsto (uX, cX_v)$, where $u \in E$ and $c \in \mathbb{F}_q$.

3.3 Extracting the Vinegar Vector Space 

The solution set of Claim 1 can be easily determined, as it amounts to solving
a linear system of $(n - r)(n + v)^2$ equations in the $(n + v)^2$ unknowns of $L$ over
a finite field of size $q$. Let us call "vinegar vector space" the image through $S$ of
all the values $v$ such that the first $n$ coordinates of $S(v)$ are zero. Similarly, let
us call "normal vector space" the image through $S$ of all the values $v$ such that
the last $v$ coordinates equal zero. Before explaining how to use the knowledge
of this solution set to recover these two vector spaces, let us state three useful lemmas.

Lemma 1. Let $u$ be in $E$, $\pi_u$ be the minimal polynomial of $u$ over $\mathbb{F}_q$, and $\chi_{\Lambda_{u,c}}$
be the characteristic polynomial of $\Lambda_{u,c} : (X, X_v) \mapsto (uX, cX_v)$. Then:

  $\chi_{\Lambda_{u,c}}(x) = (x - c)^v \cdot \pi_u(x)^{n/\deg(\pi_u)}$ .

Lemma 2. Let $u$ be in $E$ and $\pi_u$ the minimal polynomial of $u$ over $\mathbb{F}_q$. Then:

  $\pi_u(x) = (x - u)(x - u^q) \cdots (x - u^{q^{\deg(\pi_u) - 1}})$ .

Lemma 3 (Thm. 3.25 [?]). The number of irreducible monic polynomials of
degree $n$ in $\mathbb{F}_q[X]$ is $\frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}$, where $\mu$ is the Möbius function (1). It follows
that the number of elements in $E$ with a minimal polynomial of degree $n$ is at
least $q^n - q^{n/2} - q^{n/2 - 1} - \cdots - q^2 - q$.
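
The counting formula of Lemma 3 is easy to evaluate. The sketch below is an illustration, with hypothetical helper names (`mobius`, `count_irreducible`, `count_full_degree_elements`); it uses the standard form $\frac{1}{n}\sum_{d \mid n} \mu(d)\, q^{n/d}$ of the formula and the fact that each irreducible polynomial of degree $n$ has exactly $n$ roots in $E$.

```python
def mobius(m: int) -> int:
    """Moebius function: 0 if m has a squared prime factor, else (-1)^k."""
    result, d = 1, 2
    while d * d <= m:
        if m % d == 0:
            m //= d
            if m % d == 0:      # d^2 divided the original value
                return 0
            result = -result
        d += 1
    if m > 1:                   # one remaining prime factor
        result = -result
    return result


def count_irreducible(q: int, n: int) -> int:
    """Number of monic irreducible polynomials of degree n over F_q."""
    total = sum(mobius(d) * q ** (n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n


def count_full_degree_elements(q: int, n: int) -> int:
    """Number of elements of F_{q^n} whose minimal polynomial has degree n."""
    return n * count_irreducible(q, n)
```

For the first signature parameter set ($q = 31$, $n = 31$ prime), only the $q$ elements of the base field fail to have a minimal polynomial of full degree, which is why a random solution almost always reveals an element $u$ of degree $n$.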

Let $M$ be any element picked at random from the solution set of Claim 1. Since
$M = S^{-1} \circ \Lambda_{u,c} \circ S$ for some $(u, c) \in E \times \mathbb{F}_q$, $M$ and $\Lambda_{u,c}$ are conjugate and thus
have the same characteristic polynomial $\chi_M(x) = (x - c)^v \cdot \pi_u(x)^{n/\deg(\pi_u)}$ according
to Lemma 1. In addition, Lemma 3 shows that for $u$ chosen uniformly at random
in $E$, $\deg(\pi_u)$ has more than $1 - q/q^{n/2}$ chances to be $n$. We can therefore assume
in the following that $c$ and $\pi_u$ are known from the factorization of $\chi_M$.

(1) $\mu(1) = 1$, $\mu(x) = (-1)^k$ for $x$ a product of $k$ distinct primes, and $\mu(x) = 0$ otherwise.

The factorization of $\pi_u$ over $E$ in turn discloses $u^{q^i}$ for some unknown $i$.
However, as stated at the end of Section 3.1, the split decomposition is not
affected by iterates of the Frobenius mapping and thus it is enough to solve
for $S$ in the following linear system:

  $S \circ M = \Lambda_{u^{q^i},c} \circ S$ .

Any particular solution $S_0$ of this system is sufficient, since the whole space of
solutions is a coset of the commutant of $\Lambda_{u^{q^i},c}$. The commutant of $X \mapsto u^{q^i}X$ is
the space of multiplications, since $u$ does not belong to any proper subfield of $E$. On the
contrary, the commutant of $X_v \mapsto cX_v$ is the whole space of $\mathbb{F}_q$-linear mappings,
since precisely $c$ lies in $\mathbb{F}_q$. At this point, the attacker is almost in the same
position as the legitimate signer to produce a signature since he has access to
the vinegar space through $S_0$ and can now work on

  $P \circ S_0^{-1}(X, X_v) = \Pi \circ T \circ (X^2 + \beta X + \gamma(X_v))$

instead of the original public key $P$. Let us define $\tilde{P} = P \circ S_0^{-1}$.

The next step of the attack is to recover a mapping equivalent to $T$. To this
end, we seek to cancel the part of $\tilde{P}$ that is linear in $X$, which can be achieved by
using an adequate change of variables $X \mapsto (X - b)$, where $b$ is to be determined.
The expression of $\tilde{P}(X - b)$ with respect to $X$ in turn contains a quadratic part,
a linear part, and a constant part. Looking at the linear part alone, the attacker
writes down that the coefficients of $X$ are equal to zero; these coefficients
are a set of $(n - r)$ affine functions with respect to $b$, and solving for $b$ allows
the attacker to recover $\beta$. The final step is to recover an equivalent version of $T$.
This is done by considering the part of $\tilde{P}$ that is quadratic with respect to $X$:
$Q(X) = \Pi \circ T(X^2)$. By composing with multiplications over $E$, it is possible to
complete the $(n - r)$ coordinates of $Q$ into a full set $\overline{Q}(X)$ of $n$ coordinates by
taking a basis of $\{Q(\lambda X)\}_{\lambda \in E}$. Then, solving for $T$ in $\overline{Q}(X) = T(X^2)$ gives an
equivalent representation $T_0$ of $T$.
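
The effect of the change of variables can be checked by expanding the internal polynomial. The short computation below uses the notation of the text ($\beta$ for the linear coefficient, $b$ for the shift) and is only a worked illustration of why solving for $b$ reveals $\beta$.

```latex
\begin{align*}
(X - b)^{2} + \beta\,(X - b)
  &= X^{2} + (\beta - 2b)\,X + (b^{2} - \beta b), \\
\beta - 2b = 0 \;&\Longleftrightarrow\; b = \tfrac{1}{2}\beta
  \qquad \text{(2 is invertible in odd characteristic).}
\end{align*}
```

After the shift, the only dependence on $X$ left inside $T$ is the square $X^2$, which is what the recovery of $T_0$ exploits.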

At this point, the attacker has gained the knowledge of $S_0$, $T_0$, and $\beta_0$ such that:

  $P \circ S_0^{-1}(X, X_v) = \Pi \circ T_0 \circ (X^2 + \beta_0 X) + P \circ S_0^{-1}(0, X_v)$ .

We claim that this is equivalent to the knowledge of the secret key since the
attacker is then able to sign any message $m$ as efficiently as the legitimate signer
as follows. Draw some random value $X_v$ from the vinegar space and randomly
complete the $(n - r)$ coordinates of $m - P \circ S_0^{-1}(0, X_v)$ into an $n$-coordinate
value $\overline{m}$. Compute $Y = T_0^{-1}(\overline{m})$ and solve for $X_0$ in $(X_0 + \frac{1}{2}\beta_0)^2 = Y + \frac{1}{4}\beta_0^2$. A
signature of $m$ is then given by $S_0^{-1}(X_0, X_v)$.

3.4 Complexity Analysis and Practical Parameters 

Our attack requires $O(\log^2(q)(n + v)^6)$ operations to find the solution set of
Claim 1 and $O(\log^2(q)(n + v)^3)$ operations to factor the characteristic
polynomial $\chi_M$. The particular solution $S_0$ is found with $O(\log^2(q)(n + v)^6)$ operations.
The complexity of the other steps can be neglected and thus the attack has an
overall complexity of $O(\log^2(q)(n + v)^6)$.

The authors of the square-vinegar signature scheme claimed 80-bit security
for the following parameter sets:

                            parameter set 1    parameter set 2
  field size q                    31                 13
  normal unknowns n               31                 36
  vinegar unknowns v               4                  4
  removed polynomials r            3                  3

The complexity of our attack is about $2^{35}$ and our Magma program in the appendix
completes within minutes for both parameter sets on a common desktop PC.
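
The order of magnitude quoted above can be reproduced from the complexity formula. The sketch below is a back-of-the-envelope check that assumes the cost is $\log_2^2(q)\,(n+v)^6$ with the constant hidden in the $O()$ taken as 1; the function name `attack_cost_log2` is hypothetical.

```python
import math


def attack_cost_log2(q: int, n: int, v: int) -> float:
    """log2 of log2(q)^2 * (n + v)^6, ignoring the constant hidden in O()."""
    return 2 * math.log2(math.log2(q)) + 6 * math.log2(n + v)


set1 = attack_cost_log2(q=31, n=31, v=4)   # parameter set 1
set2 = attack_cost_log2(q=13, n=36, v=4)   # parameter set 2
```

Both exponents fall between 35 and 36, far below the claimed 80-bit security level.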

4 Cryptanalysis of the Square Encryption Scheme 

The square encryption scheme poses new challenges to the attacker. Its design
strategy of embedding the plaintext into a bigger space before applying the
internal transformation makes it impossible to use the differential mapping as
was done previously. This is due to the restricted view the attacker has of the
input space, which does not allow the inner part of the differential to be
manipulated easily. In our attack against the square encryption scheme, we therefore use a
different technique. Instead of peeling off the cryptosystem from the input, we
peel it off from the output.

4.1 Equivalent Representation of the Secret Key 

Due to the specific form of the internal transformation and without loss of
generality, we may give the following alternative decomposition of the public key:

  $P(X) = T(S(X)^2) + T(s \cdot S(X)) + t$ ,   (8)

where $S$ and $T$ are the linear parts of the original secret affine mappings and
$s = 2\sigma$ and $t = \tau + T(\sigma^2)$ with $\sigma$ and $\tau$ the original secret constants.
Since the mappings $S$ and $T$ are linear, it can be easily seen that with respect to
the input $X$, the first term of (8) is $\mathbb{F}_q$-quadratic, the second term is linear, and
the third term is constant. Furthermore, these three homogeneous terms can be
read directly on the public key itself, so that the attacker knows the following:

  $P_2(X) = T(S(X)^2)$ ,  $P_1(X) = T(s \cdot S(X))$ ,  $P_0(X) = t$ .
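
Reading the homogeneous parts off a quadratic map only requires a few evaluations. The sketch below illustrates the idea for a single quadratic polynomial over the prime field $\mathbb{F}_{31}$, treated as a black box; the names `P_MOD` and `homogeneous_parts` are hypothetical and the simplification to one polynomial is an assumption made for brevity.

```python
P_MOD = 31  # small odd prime, chosen only for the demonstration


def homogeneous_parts(f, n):
    """Recover constant, linear and quadratic coefficients of a black-box
    quadratic polynomial f in n variables over F_p (p odd)."""
    inv2 = pow(2, -1, P_MOD)                 # inverse of 2 modulo p
    c0 = f([0] * n) % P_MOD                  # constant term
    c1 = [0] * n                             # coefficients of x_i
    c2 = [[0] * n for _ in range(n)]         # c2[i][j], j <= i: coeff. of x_i*x_j

    def point(entries):
        x = [0] * n
        for index, value in entries:
            x[index] = value % P_MOD
        return x

    for i in range(n):
        plus, minus = f(point([(i, 1)])), f(point([(i, -1)]))
        c1[i] = (plus - minus) * inv2 % P_MOD
        c2[i][i] = ((plus + minus) * inv2 - c0) % P_MOD
    for i in range(n):
        for j in range(i):
            both = f(point([(i, 1), (j, 1)]))
            c2[i][j] = (both - c2[i][i] - c2[j][j]
                        - c1[i] - c1[j] - c0) % P_MOD
    return c0, c1, c2


def example(x):
    """3 + 5*x0 + 7*x1 + 2*x0^2 + 11*x0*x1 + 4*x1^2 over F_31."""
    return (3 + 5 * x[0] + 7 * x[1] + 2 * x[0] ** 2
            + 11 * x[0] * x[1] + 4 * x[1] ** 2) % P_MOD


c0, c1, c2 = homogeneous_parts(example, 2)
```

Applied coordinate by coordinate to the public key, the same evaluations would yield the three parts $P_0$, $P_1$, and $P_2$.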

4.2 Looking for Invariant Subspaces 

As with the signature scheme, the differential of the public key provides useful
information to the attacker. In the case of the square encryption scheme, it can
be expressed as:

  $\mathrm{D}P(X, Y) = T(2 \cdot S(X) \cdot S(Y))$ .

Let us consider the partial mappings $\mathrm{D}P_y : X \mapsto \mathrm{D}P(X, y)$. Since $S : \mathbb{F}_q^{n-r} \to E$
has full rank, its image is of dimension $(n - r)$. Hence, choosing any linearly
independent vectors $y_1, \ldots, y_{n-r}$ makes $\mathrm{D}P_{y_1}, \ldots, \mathrm{D}P_{y_{n-r}}$ span the whole
vector space of mappings $\{\mathrm{D}P_z\}_z$. This shows that the attacker is able to
derive a set of mappings $\Delta = \{P_1\} \cup \{\mathrm{D}P_{y_i}\}_{i=1,\ldots,n-r}$, each of which has the
special form $T \circ \Lambda_a \circ S$, where $\Lambda_a$ stands for the multiplication by $a$ in $E$. This
set of mappings can then be rewritten as $\Delta = \{T \circ \Lambda_{\lambda_i} \circ S\}_{i=1,\ldots,n-r+1}$ where
the $n - r + 1$ values $\lambda_1, \ldots, \lambda_{n-r+1}$ are unknown, but linearly independent.

The attacker does not need to know the actual values of the $\lambda_i$ since he can
exploit this set of mappings as follows. The general idea is to look for linear
mappings $L$ that can link the public equations, say two elements $D_1 = T \circ \Lambda_{\lambda_1} \circ S$
and $D_2 = T \circ \Lambda_{\lambda_2} \circ S$ from $\Delta$. One natural idea is then to look for $L$ such that:

  $L \circ D_1 = D_2$ ,   (9)

since it can be easily checked that $L_0 = T \circ \Lambda_{\lambda_2 \lambda_1^{-1}} \circ T^{-1}$ is a particular solution
of (9). However, the solution space of (9) is not restricted to multiplications.
This is due to the 'embedding' mechanism, i.e. the fact that the mapping $S$ is
not surjective, which releases some of the constraints and allows less
structured linear mappings to be solutions.

A possible direction to solve this issue is to put more constraints on the
mapping $L$ while being careful to keep mappings of the form $T \circ \Lambda_\lambda \circ T^{-1}$ in the
solution space. This is why we do not look for a linear mapping that solves (9) alone,
but several equations similar to (9) simultaneously. This can be reformulated in
terms of $\Delta$ as follows. We look for linear mappings $L$ such that:

  $\forall i \in \{1, \ldots, m\}$,  $L \circ (T \circ \Lambda_{\lambda_i} \circ S) \in \langle T \circ \Lambda_{\lambda_{m+1}} \circ S, \ldots, T \circ \Lambda_{\lambda_{n-r+1}} \circ S \rangle$ ,   (10)

that is, the image through $L$ of $m$ elements of $\Delta$ must lie in the vector space
spanned by the remaining elements of $\Delta$. It is easy to see that if $\lambda$ is such that:

  $\forall i \in \{1, \ldots, m\}$,  $\lambda \cdot \lambda_i \in \langle \lambda_{m+1}, \ldots, \lambda_{n-r+1} \rangle$ ,   (11)

then $T \circ \Lambda_\lambda \circ T^{-1}$ must be a solution of (10).
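
The implication from (11) to (10) rests on two facts: multiplications compose by multiplying their constants, and $T$ is $\mathbb{F}_q$-linear. A short worked check follows; the coefficient names $c_{i,j}$ are introduced here for illustration only.

```latex
\begin{align*}
(T \circ \Lambda_{\lambda} \circ T^{-1}) \circ (T \circ \Lambda_{\lambda_i} \circ S)
  &= T \circ \Lambda_{\lambda\lambda_i} \circ S, \\
\lambda\lambda_i = \sum_{j=m+1}^{n-r+1} c_{i,j}\,\lambda_j,\quad c_{i,j} \in \mathbb{F}_q
  \;&\Longrightarrow\;
T \circ \Lambda_{\lambda\lambda_i} \circ S
  = \sum_{j=m+1}^{n-r+1} c_{i,j}\; T \circ \Lambda_{\lambda_j} \circ S .
\end{align*}
```

The same first identity, with $\lambda = \lambda_2\lambda_1^{-1}$ and $i = 1$, shows that $L_0$ solves (9).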

The parameter $m$ controls the number of solutions of (10) and (11). It can
be used to simultaneously render system (11) under-determined and system (10)
over-determined. This ensures that there are no solutions other than the conjugates
of multiplications. We can determine suitable values of $m$ as follows. For $i \le m$,
the fact that $\lambda \cdot \lambda_i$ lies in $\langle \lambda_{m+1}, \ldots, \lambda_{n-r+1} \rangle$ puts $n - ((n - r + 1) - m)$ constraints
on the $n$ coordinates of $\lambda$ in $\mathbb{F}_q$. As $\lambda_1, \ldots, \lambda_{n-r+1}$ are linearly independent,
the above constraints are independent. Hence (11) admits solutions as soon as
$n > m(n - (n - r + 1 - m))$. Similarly, the whole space of linear mappings $L$ has
dimension $n^2$ and each equation of (10) puts $n(n - r) - (n - r + 1 - m)$ constraints,
as mappings from $\Delta$ map $\mathbb{F}_q^{n-r}$ to $\mathbb{F}_q^n$. Therefore, system (10) is over-determined
as soon as $n^2 < m(n(n - r) - (n - r + 1 - m))$. These two conditions define
a range of values of $m$ such that the solution space of (10) becomes isomorphic
to the solution space of (11). This behavior is entirely confirmed by our Magma
implementation of the attack.
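
The two counting conditions can be evaluated directly for given parameters. The sketch below is an illustration that takes the inequalities $n > m(n - (n - r + 1 - m))$ and $n^2 < m(n(n - r) - (n - r + 1 - m))$ at face value; the function name `admissible_m` is hypothetical.

```python
def admissible_m(n: int, r: int):
    """Values of m satisfying both counting conditions of the text."""
    good = []
    for m in range(1, n - r + 1):
        span_dim = n - r + 1 - m            # dimension of the target span
        # System (11) keeps solutions: fewer constraints than coordinates.
        has_solutions = n > m * (n - span_dim)
        # System (10) is over-determined: more constraints than unknowns.
        over_determined = n * n < m * (n * (n - r) - span_dim)
        if has_solutions and over_determined:
            good.append(m)
    return good


range_set1 = admissible_m(37, 3)   # n = 37 polynomials, r = 3
range_set2 = admissible_m(54, 3)   # n = 54 polynomials, r = 3
```

Under these inequalities the smallest admissible value is $m = 2$ for both encryption parameter sets, since $m = 1$ leaves system (10) under-determined.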
4.3 Recovery of the Secret Elements

Once a linear mapping $L = T \circ \Lambda_\lambda \circ T^{-1}$ has been recovered, every element of
the secret key can be computed. Proceeding just as for the signature scheme,
the underlying multiplication $\lambda$ is revealed from the characteristic polynomial
of $L$. An equivalent representation $T_0$ of $T$ is then recovered by solving for $T$
in $L \circ T = T \circ \Lambda_\lambda$. Let $a$ be a randomly chosen element. The other components
of the secret key can then be found via:

  $S_0(a) = \sqrt{T_0^{-1}(P_2(a))}$ ,  $s_0 = \frac{1}{S_0(a)} \cdot T_0^{-1}(P_1(a))$ ,  $S_0 = \frac{1}{s_0} \cdot T_0^{-1} \circ P_1$ .

(In the case where $T_0^{-1}(P_2(a))$ is not a square in $E$, just replace $T_0$ by $-T_0$.)


4.4 Practical Parameters 

The most time-consuming step of our attack is to compute the solution space
of (10), which requires $O(\log^2(q)\, n^6)$ operations. The authors of the square
encryption scheme claimed 80-bit security for the following parameter sets:

                            parameter set 1    parameter set 2
  field size q                    31                 31
  unknowns n - r                  34                 51
  polynomials n                   37                 54

but the complexity of our attack is actually about $2^{36}$ operations for the first
parameter set and about $2^{39}$ for the second. Again, the key recovery written in
Magma only requires a couple of seconds to complete on a standard workstation.
During the attack, $m = 2$ was enough in practice to ensure that only conjugates
of multiplications were solutions.
A Simple Auxiliary Functions for Our Magma Scripts

Simple functions. The following function returns a root of $ax^2 + bx + c$.

    Solve_2nd_degree := function(a, b, c)
      is_, sqrt_delta := IsSquare(b^2 - 4*a*c);
      return is_, (is_ select (-b + sqrt_delta)/(2*a) else 0);
    end function;

Juggling between matrices and vectors:

    Mat2Vec := func< Mat | Vector(Eltseq(Mat)) >;
    Vec2Mat := func< vect, ncol | Matrix(ncol, Eltseq(vect)) >;

Space returns the vector space spanned by a set of matrices MS viewed as vectors:

    Space := func< MS, KK, dim |
      sub< VectorSpace(KK, dim) | [Mat2Vec(MS[i]) : i in [1..#MS]] > >;

The following returns the matrix of $x \mapsto \lambda x$:

    MulBy := func< lambda, EtoV, VtoE, B |
      Matrix([EtoV(VtoE(B[i])*lambda) : i in [1..#B]]) >;

Sequences of coefficients. It can be convenient to represent a quadratic
polynomial as sequences of coefficients of its homogeneous degree 0, 1, and 2
components. C012 takes a function P viewed as a sequence of n_pol polynomials on
n_var variables and outputs the corresponding sequences CS0, CS1, and CS2:

    C012 := function(KK, V_input, P, n_pol, n_var)
      CS0 := [KK!0 : ii in [1..n_pol]];
      CS1 := [[KK!0 : i in [1..n_var]] : ii in [1..n_pol]];
      CS2 := [[[KK!0 : j in [1..i]] : i in [1..n_var]] : ii in [1..n_pol]];
      x := V_input!0; y := P(x);
      for ii := 1 to n_pol do CS0[ii] := y[ii]; end for;     // constant
      for i := 1 to n_var do
        x := V_input!0; x[i] := KK!1; y1 := P(x); x[i] := KK!-1; y2 := P(x);
        for ii := 1 to n_pol do
          CS1[ii][i] := (y1[ii] - y2[ii])*(KK!2)^-1;         // coefficient of x_i,
          CS2[ii][i][i] := (y1[ii] + y2[ii])*(KK!2)^-1 - CS0[ii];  // and x_i^2,
        end for;
      end for;
      for i := 2 to n_var do for j := 1 to i-1 do
        x := V_input!0; x[i] := KK!1; x[j] := KK!1; y := P(x);
        for ii := 1 to n_pol do
          CS2[ii][i][j] := y[ii] - CS2[ii][i][i] - CS2[ii][j][j] -
                           CS1[ii][i] - CS1[ii][j] - CS0[ii];  // and x_i x_j, i ne j
        end for;
      end for; end for;
      return CS0, CS1, CS2;
    end function;
Given three sequences of coefficients C0, C1, and C2 defined with respect to
a quadratic polynomial P as above, compute the value taken by P on input x:

    Eval := func< C2, C1, C0, x, n_var |
      &+[ &+[C2[i][j]*x[i]*x[j] : j in [1..i]] : i in [1..n_var]]
      + &+[C1[i]*x[i] : i in [1..n_var]] + C0 >;

The next function computes the coefficients of the differential associated to
the homogeneous form of degree 2 specified by the sequence of its coefficients:

    Diff := function(CS2, KK, n_pol, n_var)
      DP := [ZeroMatrix(KK, n_var, n_var) : ii in [1..n_pol]];
      for ii := 1 to n_pol do
        for i := 1 to n_var do
          DP[ii][i,i] := 2*CS2[ii][i][i];
          for j := 1 to i-1 do
            DP[ii][i,j] := CS2[ii][i][j]; DP[ii][j,i] := CS2[ii][i][j];
          end for; end for; end for;
      return DP; end function;


B Magma Script to Attack the Signature Scheme 

An extension E of degree n over the base field K, also viewed as vector space V:

    q := 31; n := 31; v := 4; r := 3; K := GF(q); E := ext<K | n>;
    V, E2V := VectorSpace(E, K); V2E := E2V^-1;
    V_input := VectorSpace(K, n+v); V_vinegar := VectorSpace(K, v);
    V_message := VectorSpace(K, n-r); V_random := VectorSpace(K, r);

We then randomly draw a secret key: the coefficient $\alpha$, the linear mapping $\beta$,
and the quadratic mapping $\gamma$ to form the internal transformation

  $F : (X, X_v) \mapsto \alpha X^2 + \beta(X_v)X + \gamma(X_v)$ ,

    alpha := V2E(A[1]) where A is Random(GL(n, K));    // ensures alpha ne 0
    beta0 := Random(E); beta1 := [Random(E) : i in [1..v]];
    beta := func< Xv | &+[beta1[i]*Xv[i] : i in [1..v]] + beta0 >;
    gamma0 := Random(E); gamma1 := [Random(E) : i in [1..v]];
    gamma2 := [[Random(E) : j in [1..i]] : i in [1..v]];
    gamma := func< Xv |
      &+[ &+[gamma2[i][j]*Xv[i]*Xv[j] : j in [1..i]] : i in [1..v]] +
      &+[gamma1[i]*Xv[i] : i in [1..v]] + gamma0 >;
    F := func< X, Xv | alpha*X^2 + beta(Xv)*X + gamma(Xv) >;

and randomly draw input and output linear layers S and T:

    S1 := Random(GL(n+v, K)); S0 := Random(V_input);
    T1 := Random(GL(n, K)); T0 := Random(V);
The corresponding public key is obtained via $P = T \circ F \circ S$:

    P := function(input)
      XX := input*S1 + S0;
      X := V2E(Vector([XX[i] : i in [1..n]]));        // normal variables
      Xv := Vector([XX[i] : i in [n+1..n+v]]);        // vinegar variables
      return E2V(F(X, Xv))*T1 + T0;
    end function;

The coefficients of the homogeneous parts for the set of forms corresponding to P
are obtained via:

    PubC0, PubC1, PubC2 := C012(K, V_input, P, n-r, n+v);

We are now able to verify if a signature is valid:

    Verify := function(msg, sig)
      m := [Eval(PubC2[i], PubC1[i], PubC0[i], sig, n+v) : i in [1..n-r]];
      return &and[m[i] eq msg[i] : i in [1..n-r]];
    end function;

We now compute an equivalent secret key. First, we look for the linear mappings
Mx verifying: Mx × DP − DP × Transpose(Mx) = 0.

    B := Basis(V); PR := PolynomialRing(K, (n+v)^2);
    Mx := Matrix(n+v, [PR.i : i in [1..(n+v)^2]]);
    DP := Diff(PubC2, K, n-r, n+v);
    Eqs := [Eltseq(Mx*DP[ii] - DP[ii]*Transpose(Mx)) : ii in [1..n-r]];
    GB := [];
    for ii := 1 to n-r do
      GB := GroebnerBasis(GB cat Eqs[ii]);
      if #GB + n + 1 eq (n+v)^2 then break; end if;
    end for;

We choose a particular solution M_ by removing the n + 1 degrees of freedom
by fixing the remaining unknowns to random values, and extract the two roots
$c \in K$ and $a \in E$ of the characteristic polynomial of M_.

    repeat W := GroebnerBasis([PR.((n+v)^2-i) + Random(K) : i in [0..n]]
                              cat GB);
    until not (W eq [PR!1]);                          // complete consistently
    M_ := Matrix(n+v, [K!Evaluate(W[i], PR.i, 0) : i in [1..(n+v)^2]]);
    CPol := FactoredCharacteristicPolynomial(M_);
    if not (#CPol eq 2) then "Bad Char. Pol."; exit; end if;
    c := Roots(CPol[1][1])[1][1];                     // factor of degree 1
    a := Roots(PolynomialRing(E)!CPol[2][1])[1][1];   // of degree n

M_ must be similar to the matrix of $(X, X_v) \mapsto (aX, cX_v)$, which will disclose a
particular solution S_ as useful to sign as S:

    A := MulBy(a, E2V, V2E, B);
    is_similar, S_ := IsSimilar(M_, DiagonalJoin(A, ScalarMatrix(v, c)));
    if not is_similar then "Recovering S_ failed."; exit; end if;
Applying the change of base S_, we get $(Z, Z_v) \mapsto T(Z^2 + \beta \cdot Z + \gamma(Z_v))$:

    Z := Vector([PR.i : i in [1..n+v]])*(MatrixAlgebra(PR, n+v)!S_);
    PubZ := [Eval(PubC2[i], PubC1[i], PubC0[i], Z, n+v) : i in [1..n-r]];

To get rid of the term $\beta \cdot Z$, we look for $Y$ such that the coefficient of $Z$ in
$T((Z + Y)^2 + \beta \cdot (Z + Y) + \gamma(Z_v))$ becomes zero:

    V0 := Random(V_vinegar);
    ZpY := [PR!0 : i in [1..(n+v)^2]];                      // Z + Y
    for i := 1 to n do ZpY[i] := PR.i + PR.(i+n+v); end for;
    for i := 1 to v do ZpY[i+n] := V0[i]; end for;
    PubZv := [Evaluate(PubZ[i], ZpY) : i in [1..n-r]];
    OY := [PR!0 : i in [1..(n+v)^2]];                       // (Z,Y) = (0,Y)
    for i := 1 to n do OY[i+n+v] := PR.(i+n+v); end for;
    EqLin := &cat[[Evaluate(Coefficient(PubZv[i], PR.j, 1), OY)
                   : j in [1..n]] : i in [1..n-r]];         // equations 2Y + beta = 0
    Y0 := GroebnerBasis(EqLin);
    beta_ := Vector([K!Evaluate(Y0[i], PR.(i+n+v), 0) : i in [1..n]]);

We are now able to get the polynomials corresponding to $T(Z^2 + \gamma(Z_v))$:

    for i := 1 to n do ZpY[i] := PR.i - beta_[i]; end for;
    PubZ0 := [Evaluate(PubZ[i], ZpY) : i in [1..n-r]];

We recover $g_0 = \gamma(0)$ (remember vinegar part of ZpY was set to zero above):

    g0 := [K!Evaluate(PubZ0[i], [PR!0 : j in [1..(n+v)^2]]) : i in [1..n-r]];

and thus $Z \mapsto T(Z^2)$ together with its differential $(X, Y) \mapsto 2XY$

    PubZ2 := [PubZ0[i] - g0[i] : i in [1..n-r]];
    DPubZ2 := [Submatrix(S_*DP[i]*Transpose(S_), 1, 1, n, n) : i in [1..n-r]];

but also $(X, Y) \mapsto 2a^2XY$:

    DPubZa := [A*DPubZ2[i]*Transpose(A) : i in [1..n-r]];

This allows us to complete T into a full rank mapping T_ via $T\_(X) = \frac{1}{2}\mathrm{D}P(X, 1)$:

    SPa := Space(DPubZ2 cat DPubZa, K, n*n); SP2 := Space(DPubZ2, K, n*n);
    W := Basis(Complement(SPa, SP2));
    DPplus := DPubZ2 cat [Vec2Mat(W[i], n) : i in [1..#W]];
    T_ := (K!2)^-1*Matrix([Vector([(B[i]*DPplus[j], B[1])
                                   : j in [1..n]]) : i in [1..n]]);

and to forge a signature for any message:

    msg := Random(V_message);
    repeat
      Y := Vector(Eltseq(msg - Vector(g0)) cat Eltseq(Random(V_random)));
      is_square, sqrX := IsSquare(V2E(Y*T_^-1)); until is_square;
    forged := Vector(Eltseq(E2V(sqrX) - beta_) cat Eltseq(V0))*S_;
    if Verify(msg, forged) then "Forged signature."; end if;
C Magma Script to Attack the Encryption Scheme

    q := 31; n := 37; r := 3; K := GF(q); E := ext<K | n>;
    Vi := VectorSpace(K, n-r); Vo, K2V := VectorSpace(E, K); V2K := K2V^-1;

Build the secret key, the encryption function P, and coefficients:

    L1 := Submatrix(Random(GL(n, K)), 1, 1, n-r, n); L2 := Random(GL(n, K));
    l1 := Random(GL(n, K))[1]; l2 := Random(Vo);
    Pencrypt := func< plain | K2V(V2K(plain*L1 + l1)^2)*L2 + l2 >;
    PubC0, PubC1, PubC2 := C012(K, Vi, Pencrypt, n, n-r);

The mappings $\Delta = \{\mathrm{D}P_{y_i}\}_{i \in [1..n-r]}$ for linearly independent $y_1, \ldots, y_{n-r}$:

    DP := Diff(PubC2, K, n, n-r); Y := Random(GL(n-r, K));
    Delta := [Transpose(Matrix([Y[k]*DP[j] : j in [1..n]])) : k in [1..n-r]];

The set $\Lambda$ of linear mappings verifying (10) for some parameter m:

    m := 2; delta := [Delta[i] : i in [m+1..n-r]]; SP := Space(delta, K, (n-r)*n);
    Dual := Transpose(NullspaceMatrix(Transpose(BasisMatrix(SP))));
    P1 := Transpose(Matrix(PubC1)); B := Basis(VectorSpace(K, n^2));
    MMul := func< A | Matrix([Mat2Vec(A*Vec2Mat(B[i], n)) : i in [1..#B]]) >;
    Lambda := &meet[Nullspace(MMul(Delta[i])*Dual) : i in [1..m]]
              meet Nullspace(MMul(P1)*Dual);

Compute the characteristic polynomial CP of a random linear mapping in $\Lambda$:

    M := Vec2Mat(Random(Lambda), n); CP := FactoredCharacteristicPolynomial(M);
    a := Roots(PolynomialRing(E)!CP[1][1])[1][1];
    A := MulBy(a, K2V, V2K, Basis(Vo));

Recover the secret elements:

    res, L2_ := IsSimilar(M, A); R := Random(Vi);
    v := V2K(Vector([(R*DP[j], R) : j in [1..n]])*L2_^-1)/2;
    res, s := IsSquare(v);
    if not res then L2_ := -L2_; res, s := IsSquare(-v); end if;
    l1_ := K2V(V2K(R*P1*L2_^-1)/(2*s));
    L1_ := P1*L2_^-1*MulBy(1/V2K(2*l1_), K2V, V2K, Basis(Vo));
    l2_ := Pencrypt(Vi!0) - K2V(V2K(l1_)^2)*L2_;
    ImL1_ := sub< Vo | [L1_[i] : i in [1..n-r]] >;
    Disclose := function(cipher)                  // illegitimate decryption!
      is_square, root := IsSquare(V2K((cipher - l2_)*L2_^-1));
      if is_square then Z := K2V(root);
        if (Z - l1_) in ImL1_ then return true, Solution(L1_, Z - l1_);
        else if (-Z - l1_) in ImL1_ then return true, Solution(L1_, -Z - l1_);
        else return false, _; end if; end if; else return false, _; end if;
    end function;
    plain := Random(Vi); b, p := Disclose(Pencrypt(plain));
    if b and (p eq plain) then "Decryption successful."; end if;
Abstract. We present a new algorithm based on binary quadratic forms
to factor integers of the form $N = pq^2$. Its heuristic running time is
exponential in the general case, but becomes polynomial when special
(arithmetic) hints are available, which is exactly the case for the so-called
NICE family of public-key cryptosystems based on quadratic fields introduced
in the late 90s. Such cryptosystems come in two flavours, depending
on whether the quadratic field is imaginary or real. Our factoring
algorithm yields a general key-recovery polynomial-time attack on NICE,
which works for both versions: Castagnos and Laguillaumie recently
obtained a total break of imaginary-NICE, but their attack could not apply
to REAL-NICE. Our algorithm is rather different from classical factoring
algorithms: it combines Lagrange’s reduction of quadratic forms with a
provable variant of Coppersmith’s lattice-based root finding algorithm for
homogeneous polynomials. It is very efficient given either of the following
arithmetic hints: the public key of imaginary-NICE, which provides an
alternative to the CL attack; or the knowledge that the regulator of the
quadratic field $\mathbb{Q}(\sqrt{p})$ is unusually small, just like in REAL-NICE.

Keywords: Public-key Cryptanalysis, Factorisation, Binary Quadratic
Forms, Homogeneous Coppersmith’s Root Finding, Lattices.


1 Introduction 

Many public-key cryptosystems require the hardness of factoring large integers
of the special form $N = pq^2$, such as Okamoto’s Esign [Oka90], Okamoto and
Uchiyama’s encryption [OU98], Takagi’s fast RSA variants [Tak98], and the large
family (surveyed in [BTV04]) of cryptosystems based on quadratic fields, which
was initiated by Buchmann and Williams’ key exchange [BW88], and which
includes the NICE cryptosystems [HPT99, PT99, PT00, JSW08] (whose main feature
is a quadratic decryption). These moduli are popular because they can lead
to special functionalities (like homomorphic encryption) or improved efficiency
(compared to RSA). Moreover, no significant weakness has been found compared to
standard RSA moduli of the form $N = pq$: to the best of our knowledge, the only
results on $pq^2$ factorisation are [PO96, Per01, BDH99]. More precisely,
[PO96, Per01] obtained a linear speed-up of Lenstra’s ECM, and [BDH99, Sect. 6]
can factor in time $\tilde{O}(N^{1/9})$ when $p$ and $q$ are balanced.
Furthermore, computing the “squarefree part” of an integer (that is, given
$N \in \mathbb{N}$ as input, compute $(r, s) \in \mathbb{N}^2$ such that
$N = r^2 s$ with $s$ squarefree) is a classical problem in algorithmic number
theory (cf. [AM94]), because it is polynomial-time equivalent to determining
the ring of integers of a number field [Chi89].

However, some of these cryptosystems actually provide additional information
(other than $N$) in the public key, which may render factorisation easy.
For instance, Howgrave-Graham [How01] showed that the public key of [Oka86]
disclosed the secret factorisation in polynomial time, using the gcd extension
of Coppersmith’s root finding method [Cop97]. Very recently, Castagnos and
Laguillaumie [CL09] showed that the public key in the imaginary version
[HPT99, PT99, PT00] of NICE allows one to retrieve the secret factorisation in
polynomial time. This additional information in the public key was crucial to
make the complexity of decryption quadratic in imaginary-NICE, which was the
main claimed benefit of NICE. But surprisingly, the attack of [CL09] does not
work against REAL-NICE [JSW08], which is the version of NICE with real (rather
than imaginary) quadratic fields, and which also offers quadratic decryption.
In particular, the public key of REAL-NICE only consists of $N = pq^2$, but the
prime $p$ has special arithmetic properties.

Our Results. We present a new algorithm to factor integers of the form
$N = pq^2$, based on binary quadratic forms (or equivalently, ideals of orders of
quadratic number fields). In the worst case, its heuristic running time is
exponential, namely $\tilde{O}(p^{1/2})$. But in the presence of special hints,
it becomes heuristically polynomial. These hints are different from the usual
ones of lattice-based factoring methods [Cop97, BDH99, How01], where they are a
fraction of the bits of the secret prime factors. Instead, our hints are
arithmetic, and correspond exactly to the situation of NICE, including both the
imaginary [HPT99, PT99, PT00] and real [JSW08] versions. This gives rise to the
first general key-recovery polynomial-time attack on NICE, using only the
public key.

More precisely, our arithmetic hints can be either of the following two: 

i. The hint is an ideal equivalent to a secret ideal of norm $q^2$ in an imaginary
quadratic field of discriminant $-pq^2$: in NICE, such an ideal is disclosed by the
public key. This gives an alternative attack on NICE, different from [CL09].

ii. The hint is the knowledge that the regulator of the quadratic field
$\mathbb{Q}(\sqrt{p})$ is unusually small, just like in REAL-NICE. Roughly
speaking, the regulator is a real number which determines how “dense” the units
of the ring of integers of the number field $\mathbb{Q}(\sqrt{p})$ are. This
number is known to lie in the large interval
$\left[\log\left(\tfrac{1}{2}(\sqrt{p-4}+\sqrt{p})\right),\ \sqrt{\tfrac{1}{2}p}\left(\tfrac{1}{2}\log p + 1\right)\right]$.
But for infinitely many $p$ (including square-free numbers of the form
$p = k^2 + r$, where $p > 5$, $r \mid 4k$ and $-k < r \le k$, see [Deg58]), the
regulator is at most polynomial in $\log p$. For these unusually small
regulators, our algorithm heuristically runs in time polynomial in the
bit-length of $N = pq^2$, which gives the first total break of REAL-NICE [JSW08].
We stress that although such $p$’s are easy to construct, their density is
believed to be arbitrarily small.

Interestingly, our algorithm is rather different from classical factoring
algorithms. It is a combination of Lagrange’s reduction of quadratic forms with a
provable variant of Coppersmith’s lattice-based root finding algorithm [Cop97]
for homogeneous polynomials. In a nutshell, our factoring method first looks for
a reduced binary quadratic form $f(x, y) = ax^2 + bxy + cy^2$ properly
representing $q^2$ with small coefficients, i.e., there exist small coprime
integers $x_0$ and $y_0$ such that $q^2 = f(x_0, y_0)$. In case i., such a
quadratic form is already given. In case ii., such a quadratic form is found by
a walk along the principal cycle of the class group of discriminant $pq^2$,
using Lagrange’s reduction of (indefinite) quadratic forms. Finally, the
algorithm finds such small coprime integers $x_0$ and $y_0$ such that
$q^2 = f(x_0, y_0)$, by using the fact that $\gcd(f(x_0, y_0), pq^2)$ is large.
This discloses $q^2$ and therefore the factorisation of $N$. In both cases, the
search for $x_0$ and $y_0$ is done with a new rigorous homogeneous bivariate
variant of Coppersmith’s method, which might be of independent interest:
it was pointed out to us that Bernstein [Ber08] independently used a similar
method in the different context of Goppa codes decoding.

Our algorithm requires “natural” bounds on the roots of reduced quadratic 
forms of a special shape. We are unable to prove rigorously all these bounds, 
which makes our algorithm heuristic (like many factoring algorithms). But we 
have performed many experiments supporting such bounds, and the algorithm 
works very well in practice. 

Factorisation and Quadratic Forms. Our algorithm is based on quadratic
forms, which share a long history with factoring (see [CP01]). Fermat’s factoring
method represents $N$ in two intrinsically different ways by the quadratic form
$x^2 + y^2$. It has been improved by Shanks with SQUFOF, whose complexity is
$\tilde{O}(N^{1/4})$ (see [GW08] for a detailed analysis). Like ours, this method
works with the infrastructure of a class group of positive discriminant, but is
different in spirit since it searches for an ambiguous form (after having found
a square form), and does not focus on discriminants of a special shape. Schoof’s
factoring algorithms [Sch82] are also essentially looking for ambiguous forms.
One is based on computation in class groups of complex quadratic orders and the
other is close to SQUFOF since it works with real quadratic orders by computing
a good approximation of the regulator to find an ambiguous form. Like SQUFOF,
this algorithm does not take advantage of working in a non-maximal order
and is rather different from our algorithm. Both algorithms of [Sch82] run in
$O(N^{1/5})$ under the generalised Riemann hypothesis. McKee’s method [McK99]
is a speedup of Fermat’s algorithm (and was presented as an alternative to
SQUFOF) with a heuristic complexity of $O(N^{1/4})$ instead of $O(N^{1/2})$.

SQUFOF and other exponential methods are often used to factor small num- 
bers (say 50 to 100 bits), for instance in the post-sieving phase of the Number 
Field Sieve algorithm. Some interesting experimental comparisons can be found 
in mm- Note that the currently fastest rigorous deterministic algorithm actu- 
ally has exponential complexity: it is based on a polynomial evaluation method 
(for a polynomial of the form $x(x-1)\cdots(x-B+1)$ for some bound $B$) and its
best variant is described in [BGS07]. Finally, all sieve factoring algorithms are
somewhat related to quadratic forms, since their goal is to find random pairs
$(x, y)$ of integers such that $x^2 \equiv y^2 \pmod{N}$. However, these
algorithms factor generic numbers and have a subexponential complexity.

Road Map. The rest of the paper is organised as follows. Section 2
recalls facts on quadratic fields and quadratic forms, and presents our
heuristics, which are supported by experiments. Section 3 describes the
homogeneous Coppersmith method and Section 4 exhibits our main result: the
factoring algorithm. The last section consists of the two cryptanalyses of
cryptosystems based on real quadratic fields (REAL-NICE) and on imaginary
quadratic fields (NICE).


2 Background on Quadratic Fields and Quadratic Forms 

2.1 Quadratic Fields 

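Let $D \neq 0, 1$ be a squarefree integer and consider the quadratic number field
$K = \mathbb{Q}(\sqrt{D})$. If $D < 0$ (resp. $D > 0$), $K$ is called an imaginary
(resp. a real) quadratic field. The fundamental discriminant $\Delta_K$ of $K$ is
defined as $\Delta_K = D$ if $D \equiv 1 \pmod 4$ and $\Delta_K = 4D$ otherwise.
An order $\mathcal{O}$ in $K$ is a subset of $K$ such that $\mathcal{O}$ is a
subring of $K$ containing 1 and $\mathcal{O}$ is a free $\mathbb{Z}$-module of
rank 2. The ring $\mathcal{O}_{\Delta_K}$ of algebraic integers in $K$ is the
maximal order of $K$. It can be written as $\mathbb{Z} + \omega_K\mathbb{Z}$,
where $\omega_K = \frac{1}{2}(\Delta_K + \sqrt{\Delta_K})$. If we denote by
$f = [\mathcal{O}_{\Delta_K} : \mathcal{O}]$ the finite index of any order
$\mathcal{O}$ in $\mathcal{O}_{\Delta_K}$, then
$\mathcal{O} = \mathbb{Z} + f\omega_K\mathbb{Z}$. The integer $f$ is called the
conductor of $\mathcal{O}$. The discriminant of $\mathcal{O}$ is then
$\Delta_f = f^2\Delta_K$. Now, let $\mathcal{O}_\Delta$ be an order of
discriminant $\Delta$ and $\mathfrak{a}$ be a nonzero ideal of
$\mathcal{O}_\Delta$; its norm is
$N(\mathfrak{a}) = |\mathcal{O}_\Delta/\mathfrak{a}|$. A fractional ideal is a
subset $\mathfrak{a} \subset K$ such that $d\mathfrak{a}$ is an ideal of
$\mathcal{O}_\Delta$ for some $d \in \mathbb{N}$. A fractional ideal
$\mathfrak{a}$ is said to be invertible if there exists another fractional ideal
$\mathfrak{b}$ such that $\mathfrak{a}\mathfrak{b} = \mathcal{O}_\Delta$. The
ideal class group of $\mathcal{O}_\Delta$ is
$C(\mathcal{O}_\Delta) = I(\mathcal{O}_\Delta)/P(\mathcal{O}_\Delta)$, where
$I(\mathcal{O}_\Delta)$ is the group of invertible fractional ideals of
$\mathcal{O}_\Delta$ and $P(\mathcal{O}_\Delta)$ the subgroup consisting of
principal ideals. Its cardinality is the class number of $\mathcal{O}_\Delta$,
denoted by $h(\mathcal{O}_\Delta)$. A nonzero ideal $\mathfrak{a}$ of
$\mathcal{O}_\Delta$ is said to be prime to $f$ if
$\mathfrak{a} + f\mathcal{O}_\Delta = \mathcal{O}_\Delta$. We denote by
$I(\mathcal{O}_\Delta, f)$ the subgroup of $I(\mathcal{O}_\Delta)$ of ideals
prime to $f$. The group $\mathcal{O}_\Delta^*$ of units in $\mathcal{O}_\Delta$
is equal to $\{\pm 1\}$ for all $\Delta < 0$, except when $\Delta$ is equal to
$-3$ and $-4$ ($\mathcal{O}_{-3}^*$ and $\mathcal{O}_{-4}^*$ are respectively
the group of sixth and fourth roots of unity). When $\Delta > 0$, then
$\mathcal{O}_\Delta^* = \langle -1, \varepsilon_\Delta \rangle$ where
$\varepsilon_\Delta > 0$ is called the fundamental unit. The real number
$R_\Delta = \log(\varepsilon_\Delta)$ is the regulator of $\mathcal{O}_\Delta$.
The following important bounds on the regulator of a real quadratic field can
be found in [JLW95]:

\[
\log\left(\tfrac{1}{2}\left(\sqrt{\Delta - 4} + \sqrt{\Delta}\right)\right)
\le R_\Delta <
\sqrt{\tfrac{1}{2}\Delta}\left(\tfrac{1}{2}\log\Delta + 1\right).
\tag{1}
\]

The lower bound is reached infinitely often, for instance with $\Delta = x^2 + 4$
with $2 \nmid x$. Finally, the following proposition is at the heart of both NICE
and REAL-NICE.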
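Proposition 1 ([Cox99, Proposition 7.20], [Wei04, Theorem 2.16]). Let
$\mathcal{O}_{\Delta_f}$ be an order of conductor $f$ in a quadratic field $K$.

i. If $\mathfrak{A}$ is an $\mathcal{O}_{\Delta_K}$-ideal prime to $f$, then
$\mathfrak{A} \cap \mathcal{O}_{\Delta_f}$ is an $\mathcal{O}_{\Delta_f}$-ideal
prime to $f$ of the same norm.

ii. If $\mathfrak{a}$ is an $\mathcal{O}_{\Delta_f}$-ideal prime to $f$, then
$\mathfrak{a}\mathcal{O}_{\Delta_K}$ is an $\mathcal{O}_{\Delta_K}$-ideal prime
to $f$ of the same norm.

iii. The map $\varphi_f : I(\mathcal{O}_{\Delta_f}, f) \to I(\mathcal{O}_{\Delta_K}, f)$,
$\mathfrak{a} \mapsto \mathfrak{a}\mathcal{O}_{\Delta_K}$ is an isomorphism.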

The map $\varphi_f$ from Proposition 1 induces a surjection
$\bar\varphi_f : C(\mathcal{O}_{\Delta_f}) \twoheadrightarrow C(\mathcal{O}_{\Delta_K})$
which can be efficiently computed (see [PT00]). In our settings, we will use a
prime conductor $f = q$ and consider $\Delta_q = q^2\Delta_K$, for a fundamental
discriminant $\Delta_K$. In that case, the order of the kernel of
$\bar\varphi_q$ is given by the classical analytic
class number formula (see for instance mm) 

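\[
\frac{h(\mathcal{O}_{\Delta_q})}{h(\mathcal{O}_{\Delta_K})} =
\begin{cases}
q - (\Delta_K/q) & \text{if } \Delta_K < -4,\\
\left(q - (\Delta_K/q)\right) R_{\Delta_K}/R_{\Delta_q} & \text{if } \Delta_K > 0.
\end{cases}
\tag{2}
\]

Note that in the case of real quadratic fields,
$\varepsilon_{\Delta_q} = \varepsilon_{\Delta_K}^t$ for a positive integer $t$,
hence $R_{\Delta_q}/R_{\Delta_K} = t$ and $t \mid (q - (\Delta_K/q))$.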


2.2 Representation of the Classes 


Working with ideals modulo the equivalence relation of the class group is
essentially equivalent to working with binary quadratic forms modulo
$\mathrm{SL}_2(\mathbb{Z})$ (cf. Section 5.2 of [Coh00]). Moreover, quadratic
forms are more suited to an algorithmic point of view. Every ideal
$\mathfrak{a}$ of $\mathcal{O}_\Delta$ can be written as
$\mathfrak{a} = m\left(a\mathbb{Z} + \frac{-b + \sqrt{\Delta}}{2}\mathbb{Z}\right)$
with $m \in \mathbb{Z}$, $a \in \mathbb{N}$ and $b \in \mathbb{Z}$ such that
$b^2 \equiv \Delta \pmod{4a}$. In the remainder, we will only consider primitive
integral ideals, which are those with $m = 1$. This notation also represents the
binary quadratic form $ax^2 + bxy + cy^2$ with $b^2 - 4ac = \Delta$. This
representation of the ideal is unique if the form is normal (see below). We
recall here some facts about binary quadratic forms.


Definition 1. A binary quadratic form $f$ is a degree 2 homogeneous polynomial
$f(x, y) = ax^2 + bxy + cy^2$ where $a$, $b$ and $c$ are integers, and is denoted
by $[a, b, c]$. The discriminant of the form is $\Delta = b^2 - 4ac$. If $a > 0$
and $\Delta < 0$, the form is called definite positive, and it is called
indefinite if $\Delta > 0$.

Let $M \in \mathrm{SL}_2(\mathbb{Z})$ with
$M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ and
$f = [a, b, c]$ a binary quadratic form; then $f.M$ is the equivalent binary
quadratic form $f(\alpha x + \beta y, \gamma x + \delta y)$.
Definite Positive Forms. Let us first define the crucial notion of reduction.

Definition 2. The form $f = [a, b, c]$ is called normal if $-a < b \le a$. It is
called reduced if it is normal, $a \le c$, and if $b \ge 0$ for $a = c$.

The procedure which transforms a form $f = [a, b, c]$ into a normal one consists
in setting $s$ such that $b + 2sa$ belongs to the right interval (see
[BV07, (5.4)]) and producing the form $[a, b + 2sa, as^2 + bs + c]$. Once a form
$f = [a, b, c]$ is normalised, a reduction step consists in normalising the form
$[c, -b, a]$. We denote this form by $\rho(f)$ and by Rho a corresponding
algorithm. The reduction then consists in normalising $f$, and then iteratively
replacing $f$ by $\rho(f)$ until $f$ is reduced. The time complexity of this
(Lagrange-Gauss) algorithm is quadratic (see [BV07]). It returns a reduced form
$g$ which is equivalent to $f$ modulo $\mathrm{SL}_2(\mathbb{Z})$. We will call
matrix of the reduction the matrix $M$ such that $g = f.M$. The reduction
procedure yields a uniquely determined reduced form in the class modulo
$\mathrm{SL}_2(\mathbb{Z})$.

Indefinite Forms. Our main result will deal with forms of positive discrimi- 
nant. Here is the definition of a reduced indefinite form. 

Definition 3. The form $f = [a, b, c]$ of positive discriminant $\Delta$ is
reduced if $|\sqrt{\Delta} - 2|a|| < b < \sqrt{\Delta}$, and normal if
$-|a| < b \le |a|$ for $|a| \ge \sqrt{\Delta}$, and
$\sqrt{\Delta} - 2|a| < b < \sqrt{\Delta}$ for $|a| < \sqrt{\Delta}$.

The reduction process is similar to the definite positive case. The time
complexity of the algorithm is still quadratic (see [BV07, Theorem 6.6.4]). It
returns a reduced form $g$ which is equivalent to $f$ modulo
$\mathrm{SL}_2(\mathbb{Z})$. The main difference with forms of negative
discriminant is that there will in general not exist a unique reduced form per
class, but several, organised in a cycle structure, i.e., when $f$ has been
reduced then subsequent applications of $\rho$ give other reduced forms.

Definition 4. Let $f$ be an indefinite binary quadratic form; the cycle of $f$ is
the sequence $(\rho^i(g))_{i \in \mathbb{Z}}$ where $g$ is a reduced form which is
equivalent to $f$.

From Theorem 6.10.3 of [BV07], the cycle of $f$ consists of all reduced forms
in the equivalence class of $f$. Actually, the complete cycle is obtained by a
finite number of applications of $\rho$, as the process is periodic. It has been
shown in [BTW95] that the period length $\ell$ of the sequence of reduced forms
in each class of a class group of discriminant $\Delta$ satisfies
$\frac{R_\Delta}{\log\Delta} \le \ell \le \frac{2R_\Delta}{\log 2} + 1$.

Our factoring algorithm will actually take place in the principal equivalence 
class. The following definition exhibits the principal form of discriminant A. 

Definition 5. The reduced form
$[1, \lfloor\sqrt{\Delta}\rfloor, (\lfloor\sqrt{\Delta}\rfloor^2 - \Delta)/4]$
of discriminant $\Delta$ is called the principal form of discriminant $\Delta$,
and will be denoted $1_\Delta$.
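To make the cycle structure concrete, here is a minimal Python sketch of the normalisation of Definition 3, the map $\rho$, and the enumeration of the principal cycle. It is only an illustration under stated assumptions: the function names (`normalize`, `rho`, `is_reduced`, `principal_form`, `principal_cycle`) are ours, the inequalities involving $\sqrt{\Delta}$ are translated into integer tests using $\lfloor\sqrt{\Delta}\rfloor$ (valid because $\Delta$ is not a perfect square), and the middle coefficient of the principal form is taken with the parity of $\Delta$, which we assume is what Definition 5 intends when $\lfloor\sqrt{\Delta}\rfloor$ has the wrong parity.

```python
from math import isqrt

def normalize(a, b, delta):
    """Normal form [a, b', c'] with b' = b mod 2|a| in the range of Definition 3."""
    r = isqrt(delta)                  # floor(sqrt(delta)); delta is not a square
    n = abs(a)
    # Smallest admissible b': (-|a|, |a|] if |a| > sqrt(delta),
    # (sqrt(delta) - 2|a|, sqrt(delta)) otherwise.
    lo = -n + 1 if n > r else r - 2 * n + 1
    b2 = lo + (b - lo) % (2 * n)
    return a, b2, (b2 * b2 - delta) // (4 * a)

def rho(form, delta):
    """One reduction step: normalise [c, -b, a]."""
    a, b, c = form
    return normalize(c, -b, delta)

def is_reduced(form, delta):
    """Integer version of |sqrt(delta) - 2|a|| < b < sqrt(delta)."""
    a, b, _ = form
    r, n2 = isqrt(delta), 2 * abs(a)
    low = r - n2 + 1 if n2 <= r else n2 - r
    return low <= b <= r

def principal_form(delta):
    # Assumption: b is floor(sqrt(delta)) adjusted to the parity of delta.
    return normalize(1, delta % 2, delta)

def principal_cycle(delta, max_steps=10**6):
    """List the reduced forms met when iterating rho from the principal form."""
    start = principal_form(delta)
    cycle, f = [start], rho(start, delta)
    while f != start and len(cycle) < max_steps:
        cycle.append(f)
        f = rho(f, delta)
    return cycle

if __name__ == "__main__":
    for form in principal_cycle(13 * 7**2):
        print(form)
```

For the toy discriminant $\Delta = 13 \cdot 7^2 = 637$, the cycle contains the form $[9, 13, -13]$, which takes the value $49 = 7^2$ at $(2, 1)$.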
2.3 Reduction of the Forms $[q^2, kq, (k^2 \pm p)/4]$ and Heuristics

In this subsection, $p$ and $q$ are two distinct primes of the same bit-size
$\lambda$ and $p \equiv 1 \pmod 4$ (resp. $p \equiv 3 \pmod 4$) when we deal with
positive (resp. negative) discriminant. Our goal is to factor the numbers $pq^2$
with the special normalised quadratic forms $[q^2, kq, (k^2 + p)/4]$ or
$[q^2, kq, (k^2 - p)/4]$, depending on whether we work with a negative
discriminant $\Delta_q = -pq^2$ or with a positive one $\Delta_q = pq^2$.
If $p$ and $q$ have the same size, these forms are clearly not reduced, neither
in the imaginary setting nor in the real one. But as we shall see, we can find
the reduced forms which correspond to the output of the reduction algorithm
applied to these forms.

Suppose that we know a form $\hat f_k$, either definite positive or indefinite,
which is the reduction of a form $f_k = [q^2, kq, (k^2 \pm p)/4]$ where $k$ is an
integer. Then $\hat f_k$ represents the number $q^2$. More precisely, if
$M_k = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$
is the matrix such that $\hat f_k = f_k.M_k$, then $\hat f_k.M_k^{-1} = f_k$ and
$q^2 = f_k(1, 0) = \hat f_k(\delta, -\gamma)$.
In Section 3 we will see that provided they are relatively small compared to
$\Delta_q$, the values $\delta$ and $-\gamma$ can be found in polynomial time
with a new variant of Coppersmith’s method. Our factoring algorithm can be
sketched as follows: find such a form $\hat f_k$ and, if the coefficients of
$M_k$ are sufficiently small, retrieve $\delta$ and $-\gamma$ and the
non-trivial factor $q^2$ of $\Delta_q$. In this paragraph, we give some
heuristics on the size of such a matrix $M_k$ and discuss their relevance. If
$M$ is a matrix, we denote by $|M|$ the max norm, i.e., the maximal coefficient
of $M$ in absolute value.
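The identity $q^2 = \hat f_k(\delta, -\gamma)$ can be checked numerically in the imaginary case. The following Python sketch reduces a positive definite form while accumulating the matrix of the reduction; the helper names (`reduce_definite`, `apply_matrix`, `evaluate`) and the toy primes are ours and purely illustrative, and the reduction is the textbook normalise-and-swap loop rather than an optimised implementation.

```python
def mat_mul(A, B):
    """Product of two 2x2 integer matrices."""
    return [[A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
            [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]]]

def evaluate(form, x, y):
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def apply_matrix(form, M):
    """Coefficients of f.M = f(alpha*x + beta*y, gamma*x + delta*y)."""
    a, b, c = form
    (al, be), (ga, de) = M
    return (evaluate(form, al, ga),
            2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de,
            evaluate(form, be, de))

def reduce_definite(a, b, c):
    """Reduce a positive definite form; return (reduced form, matrix M) with g = f.M."""
    M = [[1, 0], [0, 1]]
    while True:
        s = (a - b) // (2 * a)                 # puts b + 2sa in (-a, a]
        a, b, c = a, b + 2 * s * a, a * s * s + b * s + c
        M = mat_mul(M, [[1, s], [0, 1]])       # x -> x + s*y
        if a < c or (a == c and b >= 0):
            return (a, b, c), M
        a, b, c = c, -b, a                     # reduction step [c, -b, a]
        M = mat_mul(M, [[0, -1], [1, 0]])      # (x, y) -> (-y, x)

if __name__ == "__main__":
    p, q, k = 1019, 1013, 3                    # toy values, p = 3 mod 4, k odd
    f_k = (q * q, k * q, (k * k + p) // 4)
    g, M = reduce_definite(*f_k)
    delta, gamma = M[1][1], M[1][0]
    print(g, M, evaluate(g, delta, -gamma) == q * q)
```

On the small form $[25, 5, 1]$ (that is, $p = 3$, $q = 5$, $k = 1$) the loop returns $[1, 1, 19]$ with $M = \begin{pmatrix} 0 & -1 \\ 1 & 3 \end{pmatrix}$, and indeed $\hat f(3, -1) = 9 - 3 + 19 = 25$.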

In the imaginary case, it is shown in the proof of [CL09, Theorem 2] that
the forms $f_k$ belong to different classes of the kernel of the map
$\bar\varphi_q$, depending on $k$, so the reduced equivalent forms $\hat f_k$ are
the unique reduced elements of the classes of the kernel. To prove the
correctness of our attack on NICE, we need the following heuristic (indeed, the
root finding algorithm of Section 3 recovers roots up to $|\Delta_q|^{1/9}$):

Heuristic 1 (Imaginary case). Given a reduced element $\hat f_k$ of a nontrivial
class of $\ker \bar\varphi_q$, the matrix of reduction $M_k$ is such that
$|M_k| < |\Delta_q|^{1/9}$ with probability asymptotically close to 1.

In the full version, we prove a probabilistic version of Heuristic 1. From
Lemma 5.6.1 of [BV07], $|M_k| < 2\max\{q^2, (k^2 + p)/4\}/\sqrt{pq^2}$. As $f_k$
is normalised, $|k| \le q$ and $|M_k| < 2q/\sqrt{p} \approx |\Delta_q|^{1/6}$.
Note that we cannot reach such a bound with our root finding algorithm.
Experimentally, for random $k$, $M_k$ can be much smaller. For example, if the
bit-size $\lambda$ of $p$ and $q$ equals 100, the mean value of $|M_k|$ is around
$|\Delta_q|^{1/11.7}$. Our heuristic can be explained as follows.
A well-known heuristic in the reduction of positive definite quadratic forms (or
equivalently, two-dimensional lattices) is that if $[a, b, c]$ is a reduced
quadratic form of discriminant $\Delta$, then $a$ and $c$ should be close to
$\sqrt{|\Delta|}$. This cannot hold for all reduced forms, but it can be proved
to hold for an overwhelming majority of reduced forms. Applied to
$\hat f_k = [a, b, c]$, this means that we expect $a$ and $c$ to be


476 G. Castagnos et al. 


close to $|\Delta_q|^{1/2}$. Now, recall that
$q^2 = \hat f_k(\delta, -\gamma) = a\delta^2 - b\delta\gamma + c\gamma^2$, which
leads to $\delta$ and $\gamma$ close to
$\sqrt{q^2/a} = q/\sqrt{a} \approx q/|\Delta_q|^{1/4} \approx |\Delta_q|^{1/12}$.
Thus, we expect that $|M_k| \le |\Delta_q|^{1/12}$. This explains why we
obtained experimentally the bound $|\Delta_q|^{1/11.7}$. Figure 1(a) shows a
curve obtained by experimentation, which gives the probability that
$|M_k| < |\Delta_q|^{1/9}$ for random $k$, as a function of $\lambda$. This curve
also supports our heuristic.

In the real case, we prove in the following theorem that
$R_{\Delta_q}/R_{\Delta_K}$ forms $f_k$ are principal and we exhibit the
generators of the corresponding primitive ideals.

Theorem 1. Let $\Delta_K$ be a fundamental positive discriminant,
$\Delta_q = \Delta_K q^2$ where $q$ is an odd prime conductor. Let
$\varepsilon_{\Delta_K}$ (resp. $\varepsilon_{\Delta_q}$) be the fundamental unit
of $\mathcal{O}_{\Delta_K}$ (resp. $\mathcal{O}_{\Delta_q}$) and $t$ such that
$\varepsilon_{\Delta_K}^t = \varepsilon_{\Delta_q}$. Then the principal ideals of
$\mathcal{O}_{\Delta_q}$ generated by $q\varepsilon_{\Delta_K}^i$ correspond to
quadratic forms $f_{k(i)} = [q^2, k(i)q, (k(i)^2 - p)/4]$ with
$i \in \{1, \ldots, t-1\}$, where $k(i)$ is an integer defined modulo $2q$ and
computable from $\varepsilon_{\Delta_K}^i \bmod q$.

Proof. Let $\alpha_i = q\varepsilon_{\Delta_K}^i$ with $i \in \{1, \ldots, t-1\}$.
Following the proof of [BTW95, Proposition 2.9], we detail here the computation
of $\mathfrak{a}_i = \alpha_i\mathcal{O}_{\Delta_q}$. Let $x_i$ and $y_i$ be two
integers such that $\varepsilon_{\Delta_K}^i = x_i + y_i\omega_K$. Then
$\alpha_i = qx_i + y_i q\Delta_K(1 - q)/2 + y_i(\Delta_q + \sqrt{\Delta_q})/2$,
and $\alpha_i$ is an element of $\mathcal{O}_{\Delta_q}$. Let $m_i$, $a_i$ and
$b_i$ be three integers such that
$\mathfrak{a}_i = m_i\left(a_i\mathbb{Z} + \frac{-b_i + \sqrt{\Delta_q}}{2}\mathbb{Z}\right)$.
As mentioned in the proof of [BTW95, Proposition 2.9], $m_i$ is the smallest
positive coefficient of $\sqrt{\Delta_q}/2$ in $\mathfrak{a}_i$. As
$\mathcal{O}_{\Delta_q}$ is equal to
$\mathbb{Z} + (\Delta_q + \sqrt{\Delta_q})/2\,\mathbb{Z}$,
$\alpha_i\mathcal{O}_{\Delta_q}$ is generated by $\alpha_i$ and
$\alpha_i(\Delta_q + \sqrt{\Delta_q})/2$ as a $\mathbb{Z}$-module. So a simple
calculation gives that $m_i = \gcd(y_i, q(x_i + y_i\Delta_K/2))$.
As $\varepsilon_{\Delta_K}^i$ is not an element of $\mathcal{O}_{\Delta_q}$, we
have $\gcd(y_i, q) = 1$, so $m_i = \gcd(y_i, x_i + y_i\Delta_K/2)$. The same
calculation to find $m_i'$ for the ideal
$\varepsilon_{\Delta_K}^i\mathcal{O}_{\Delta_K}$ reveals that $m_i = m_i'$. As
$\varepsilon_{\Delta_K}^i\mathcal{O}_{\Delta_K} = \mathcal{O}_{\Delta_K}$, we must
have $m_i' = 1$. Now, $N(\mathfrak{a}_i) = |N(\alpha_i)| = q^2$ and
$N(\mathfrak{a}_i) = m_i^2 a_i = a_i$, and therefore $a_i = q^2$. Let us now find
$b_i$. Note that $b_i$ is defined modulo $2a_i$. Since
$\alpha_i \in \alpha_i\mathcal{O}_{\Delta_q}$, there exist $\mu_i$ and $\nu_i$
such that $\alpha_i = a_i\mu_i + \frac{-b_i + \sqrt{\Delta_q}}{2}\nu_i$. By
identification in the basis $(1, \sqrt{\Delta_q})$, $\nu_i = y_i$, and by a
multiplication by 2, we obtain
$2qx_i + qy_i\Delta_K \equiv -b_i y_i \pmod{2a_i}$. As
$b_i \equiv \Delta_q \pmod 2$, we only have to determine $b_i$ modulo $q^2$. As
$y_i$ is prime to $q$, we have $b_i \equiv k(i)q \pmod{q^2}$ with
$k(i) \equiv -2x_i/y_i - \Delta_K \pmod q$. Finally, as we must have
$-a_i < b_i \le a_i$ if $a_i > \sqrt{\Delta_q}$, and
$\sqrt{\Delta_q} - 2a_i < b_i < \sqrt{\Delta_q}$ otherwise, $k(i)$ is the unique
integer with $k(i) \equiv \Delta_q \pmod 2$ and
$k(i) \equiv -2x_i/y_i - \Delta_K \pmod q$ such that $b_i = k(i)q$ satisfies
these inequalities. Eventually, the principal ideal of $\mathcal{O}_{\Delta_q}$
generated by $q\varepsilon_{\Delta_K}^i$ corresponds to the form
$[q^2, k(i)q, c_i]$ with
$c_i = (b_i^2 - \Delta_q)/(4a_i) = (k(i)^2 - \Delta_K)/4$. □

From this theorem, we see that if we go across the cycle of principal forms,
then we will find reduced forms $\hat f_k$. To analyse the complexity of our
factoring algorithm, we have to know the distribution of these forms on the
cycle. An appropriate tool is the Shanks distance $d$ (see
[BV07, Definition 10.1.4]), which is close to the number of iterations of Rho
between two forms. One has $d(1_{\Delta_q}, f_{k(i)}) = iR_{\Delta_K}$. From
Lemma 10.1.8 of [BV07], $|d(f_{k(i)}, \hat f_{k(i)})| < \log q$ for all
$i = 1, 2, \ldots, t-1$. Let $j$ be the smallest integer such that
$0 < jR_{\Delta_K} - 2\log q$; then, as
$jR_{\Delta_K} = d(f_{k(i)}, f_{k(i+j)}) = d(f_{k(i)}, \hat f_{k(i)}) + d(\hat f_{k(i)}, \hat f_{k(i+j)}) + d(\hat f_{k(i+j)}, f_{k(i+j)})$,
from the triangle inequality, one has
$jR_{\Delta_K} < 2\log q + |d(\hat f_{k(i)}, \hat f_{k(i+j)})|$. So,
$|d(\hat f_{k(i)}, \hat f_{k(i+j)})| > jR_{\Delta_K} - 2\log q > 0$. This
inequality proves that $f_{k(i)}$ and $f_{k(i+j)}$ do not reduce to the same
form. Experiments actually show that asymptotically,
$|d(f_{k(i)}, \hat f_{k(i)})|$ is very small on average (smaller than 1). As a
consequence, as pictured in Figure 2,
$d(1_{\Delta_q}, \hat f_{k(i)}) \approx iR_{\Delta_K}$.

Fig. 1. Probability that $|M_k| < |\Delta_q|^{1/9}$ as a function of the
bit-size $\lambda$ of $p$ and $q$: (a) imaginary case, (b) real case



Fig. 2. Distribution of the forms $\hat f_{k(i)}$ along the principal cycle


Moreover, as in the imaginary case, experiments show that asymptotically the
probability that the norm $|M_k|$ of the matrices of reduction is smaller than
$\Delta_q^{1/9}$ is close to 1 (see Figure 1(b)). This leads to the following
heuristic.

Heuristic 2 (Real case). From the principal form $1_{\Delta_q}$, a reduced form
$\hat f_k$ such that the matrix of the reduction, $M_k$, satisfies
$|M_k| < \Delta_q^{1/9}$ can be found in $O(R_{\Delta_K})$ successive
applications of Rho.

We also ran some experiments to investigate the case where the bit-sizes of $p$
and $q$ are unbalanced. In particular, when the size of $q$ grows, the norm of
the matrix of reduction becomes larger. For example, for a 100-bit $p$ and a
200-bit $q$ (resp. a 300-bit $q$), more than 95% (resp. 90%) of the $\hat f_k$
have a matrix $M_k$ with $|M_k| < \Delta_q^{1/6.25}$ (resp.
$|M_k| < \Delta_q^{1/5.44}$).
3 A Rigorous Homogeneous Variant of Coppersmith’s
Root Finding Method

Our factoring algorithm searches many times for small modular roots of
degree-two homogeneous polynomials, and the most popular technique to find
them is based on Coppersmith’s method (see [Cop97] or May’s survey [May07]).
Our problem is the following: given $f(x, y) = x^2 + bxy + cy^2$, a (monic)
binary quadratic form, and $N = pq^2$, an integer of unknown factorisation, find
$(x_0, y_0) \in \mathbb{Z}^2$ such that $f(x_0, y_0) \equiv 0 \pmod{q^2}$, while
$|x_0|, |y_0| \le M$, where $M \in \mathbb{N}$. The usual technique for this kind
of problem is only heuristic, since it is the gcd extension of bivariate
congruences. Moreover, precise bounds cannot be found in the literature.
Fortunately, because our polynomial is homogeneous, we will actually be able to
prove the method. This homogeneous variant is quite similar to the one-variable
standard Coppersmith method, but is even simpler to describe and more efficient,
since there is no need to balance coefficients. We denote by $\|\cdot\|$ the
usual Euclidean norm for polynomials. The main tool to solve this problem is
given by the following variant of the widespread elementary Howgrave-Graham
lemma [How97].

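Lemma 1. Let $B \in \mathbb{N}$ and $g(x, y) \in \mathbb{Z}[x, y]$ be a
homogeneous polynomial of total degree $\delta$. Let $M > 0$ be a real number and
suppose that $\|g(x, y)\| < \frac{B}{\sqrt{\delta + 1}\, M^\delta}$; then for all
$x_0, y_0 \in \mathbb{Z}$ such that $g(x_0, y_0) \equiv 0 \pmod B$ and
$|x_0|, |y_0| \le M$, $g(x_0, y_0) = 0$.

Proof. Let $g(x, y) = \sum_{i=0}^{\delta} g_i x^i y^{\delta - i}$ where some
$g_i$’s might be zero. We have

\[
|g(x_0, y_0)| \le \sum_{i=0}^{\delta} |g_i|\,|x_0^i y_0^{\delta - i}|
\le M^\delta \sum_{i=0}^{\delta} |g_i|
\le M^\delta \sqrt{\delta + 1}\,\|g(x, y)\| < B
\]

and therefore $g(x_0, y_0) = 0$. □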

The trick is then to find only one small enough bivariate homogeneous
polynomial satisfying the conditions of this lemma and to extract the rational
root of the corresponding univariate polynomial with standard techniques.
By contrast, the original Howgrave-Graham lemma suggests looking for
two polynomials of small norm having $(x_0, y_0)$ as an integral root, and
recovering it via elimination theory. The usual way to obtain these polynomials
is to form a lattice spanned by a special family of polynomials, and to use the
LLL algorithm (cf. [LLL82]) to obtain the two “small” polynomials.
Unfortunately, this reduction does not guarantee that these polynomials will be
algebraically independent, and the elimination can then lead to a trivial
relation. Consequently, this bivariate approach is heuristic. Fortunately, for
homogeneous polynomials, we can take another approach by using Lemma 1 and then
considering a univariate polynomial with a rational root. This makes the method
rigorous and slightly simpler, since we need a bound on $\|g(x, y)\|$ and not on
$\|g(xX, yY)\|$, where $X$ and $Y$ are bounds on the roots; the resulting
lattice therefore has a smaller determinant than in the classical bivariate
approach.


Factoring pq 2 with Quadratic Forms: Nice Cryptanalyses 479 


To evaluate the maximum of the bound we can obtain, we need the size of
the first vector provided by LLL, which is given by:

Lemma 2 (LLL). Let $L$ be a full-rank lattice in $\mathbb{Z}^d$ spanned by an
integer basis $B = \{b_1, \ldots, b_d\}$. The LLL algorithm, given $B$ as input,
will output in time $O(d^6 \log^3(\max \|b_i\|))$ a non-zero vector $u \in L$
satisfying $\|u\| \le 2^{(d-1)/4} \det(L)^{1/d}$.

We will now prove the following general result regarding the modular roots of
bivariate homogeneous polynomials, which can be of independent interest.

Theorem 2. Let $f(x, y) \in \mathbb{Z}[x, y]$ be a homogeneous polynomial of
degree $\delta$ with $f(x, 0) = x^\delta$, $N$ be a non-zero integer and $\alpha$
be a rational number in $[0, 1]$; then one can retrieve in polynomial time in
$\log N$, $\delta$ and the bit-size of $\alpha$, all the rationals $x_0/y_0$,
where $x_0$ and $y_0$ are integers such that $\gcd(f(x_0, y_0), N) \ge N^\alpha$
and $|x_0|, |y_0| \le N^{\alpha^2/(2\delta)}$.

Proof. Let $b$ be a divisor of $N$ for which there exists
$(x_0, y_0) \in \mathbb{Z}^2$ such that $b = \gcd(f(x_0, y_0), N) \ge N^\alpha$.
We define some integral parameters (to be specified later) $m$, $t$ and $t'$
with $t = m + t'$ and construct a family of $\delta t + 1$ homogeneous
polynomials $g$ and $h$ of degree $\delta t$, such that $(x_0, y_0)$ is a common
root modulo $b^m$. More precisely, we consider the following polynomials

\[
\begin{cases}
g_{i,j}(x, y) = x^j y^{\delta(t-i)-j} f^i N^{m-i} & \text{for } i = 0, \ldots, m-1,\ j = 0, \ldots, \delta - 1,\\
h_i(x, y) = x^i y^{\delta t' - i} f^m & \text{for } i = 0, \ldots, \delta t'.
\end{cases}
\]

We build the triangular matrix $L$ of dimension $\delta t + 1$, containing the
coefficients of the polynomials $g_{i,j}$ and $h_i$. We will apply LLL to the
lattice spanned by the rows of $L$. The columns correspond to the coefficients
of the monomials
$y^{\delta t}, xy^{\delta t - 1}, \ldots, x^{\delta t - 1}y, x^{\delta t}$. Let
$\beta \in [0, 1]$ be such that $M = N^\beta$. The product of the diagonal
elements gives $\det(L) = N^{\delta m(m+1)/2}$. If we omit the quantities that
do not depend on $N$, to satisfy the inequality of Lemma 1 with the root bound
$M$, the LLL bound from Lemma 2 implies that we must have

\[
\delta m(m+1)/2 < (\delta t + 1)(\alpha m - \delta t \beta) \tag{3}
\]

and if we set $\lambda$ such that $t = \lambda m$, this gives asymptotically
$\beta < \frac{\alpha}{\delta\lambda} - \frac{1}{2\delta\lambda^2}$, which is
maximal when $\lambda = 1/\alpha$, and in this case,
$\beta_{\max} = \alpha^2/(2\delta)$. The vector output by LLL gives a
homogeneous polynomial $\tilde f(x, y)$ such that $\tilde f(x_0, y_0) = 0$ thanks
to Lemma 1. Let $r = x/y$; any rational root of the form $x_0/y_0$ can be found
by extracting the rational roots of
$\tilde f(r) = \frac{1}{y^{\delta t}} \tilde f(x, y)$ with classical methods. □
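The shape of the lattice used in the proof can be checked on a toy instance. The Python sketch below builds the coefficient rows of the $g_{i,j}$ and $h_i$ (with $y$ set to 1, so that column $k$ holds the coefficient of $x^k y^{\delta t - k}$) and lets one verify that the matrix is lower triangular with determinant $N^{\delta m(m+1)/2}$. The function names and the sample form are ours; no lattice reduction is performed here, so this only illustrates the construction, not the full root-finding method.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (increasing degree)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def lattice_rows(f, N, m, t):
    """Rows of the basis matrix; f lists the coefficients of f(x, 1), monic in x."""
    delta = len(f) - 1
    dim = delta * t + 1
    rows, f_pow = [], [1]                         # f_pow holds f^i
    for i in range(m):
        for j in range(delta):
            g = [0] * j + [c * N ** (m - i) for c in f_pow]   # x^j f^i N^(m-i)
            rows.append(g + [0] * (dim - len(g)))
        f_pow = poly_mul(f_pow, f)
    for i in range(delta * (t - m) + 1):          # h_i = x^i f^m
        h = [0] * i + f_pow
        rows.append(h + [0] * (dim - len(h)))
    return rows

if __name__ == "__main__":
    # f(x, y) = x^2 + 3xy + 5y^2, stored as [5, 3, 1]; N, m, t are toy values.
    for row in lattice_rows([5, 3, 1], 637, 2, 3):
        print(row)
```

In an actual attack, these rows would be passed to an LLL implementation and the first output vector read back as the polynomial $\tilde f$.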

For the case we are most interested in, $\delta = 2$, $N = pq^2$ with $p$ and $q$
of the same size, i.e., $\alpha = 2/3$, we have $\lambda = 3/2$ and we can
asymptotically get roots up to $N^\beta$ with $\beta = 1/9$. If we take $m = 4$
and $t = 6$, i.e., we work with a lattice of dimension 13, we get from (3) that
$\beta \approx 1/10.63$, and with a 31-dimensional lattice ($m = 10$ and
$t = 15$), $\beta \approx 1/9.62$. If the size of $q$ grows compared to $p$,
i.e., $\alpha$ increases towards 1, then $\beta$ increases towards $1/4$. For
example, if $q$ is two times larger than $p$, i.e., $\alpha = 4/5$, then
$\beta = 1/6.25$. For $\alpha = 6/7$, we get $\beta \approx 1/5.44$.
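These exponents follow directly from inequality (3) and from $\beta_{\max} = \alpha^2/(2\delta)$, and can be recomputed exactly with rational arithmetic. The short Python sketch below does so; the function names are ours, and, as in the proof, all factors that do not depend on $N$ are ignored.

```python
from fractions import Fraction

def beta_bound(alpha, delta, m, t):
    """Largest beta allowed by (3): delta*m*(m+1)/2 < (delta*t+1)*(alpha*m - delta*t*beta)."""
    dim = delta * t + 1
    return (alpha * m - Fraction(delta * m * (m + 1), 2 * dim)) / (delta * t)

def beta_max(alpha, delta):
    """Asymptotic optimum, reached for t = m / alpha."""
    return alpha * alpha / (2 * delta)

if __name__ == "__main__":
    a = Fraction(2, 3)
    print(beta_bound(a, 2, 4, 6))      # 11/117
    print(beta_bound(a, 2, 10, 15))    # 29/279
    print(beta_max(a, 2))              # 1/9
    print(beta_max(Fraction(4, 5), 2), beta_max(Fraction(6, 7), 2))  # 4/25 9/49
```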
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We will call HomogeneousCoppersmith the algorithm which implements this 
method. It takes as input an integer N = pq 1 2 and a binary quadratic form [a, b, c], 
from which we deduce the unitary polynomial x 2 + b'xy+dy 2 , by dividing both b 
and c by a modulo N, and the parameters m and t. In fact, this method will only 
disclose proper representations of q 2 , those for which x and y are coprime, but 
we note that fk properly represents q 2 * , and therefore so does our form [a, b, c]. 
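
The effect of the parameters m and t on the admissible root size can be checked numerically. The following is a minimal sketch, assuming only inequality (3) and the asymptotic value $\alpha^2/(2\delta)$ from the proof; the function names are ours and purely illustrative.

```python
from fractions import Fraction


def beta_from_inequality(alpha, delta, m, t):
    """Supremum of beta allowed by inequality (3):
    delta*m*(m+1)/2 < (delta*t + 1) * (alpha*m - delta*t*beta)."""
    dim = delta * t + 1                      # lattice dimension
    det_exponent = Fraction(delta * m * (m + 1), 2)
    return (alpha * m - det_exponent / dim) / (delta * t)


def beta_asymptotic(alpha, delta):
    """Limit value alpha^2 / (2*delta), reached for t = m / alpha."""
    return alpha * alpha / (2 * delta)


if __name__ == "__main__":
    alpha = Fraction(2, 3)                   # N = p*q^2 with balanced p and q
    for m, t in [(4, 6), (10, 15)]:
        beta = beta_from_inequality(alpha, 2, m, t)
        print(f"m={m}, t={t}: beta = {beta} (about 1/{float(1 / beta):.2f})")
    print("asymptotic:", beta_asymptotic(alpha, 2))
```

Exact rational arithmetic is used so that the small-dimension values can be compared directly with the limit.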

The case $\alpha = 1$ of Theorem 2 can already be found in Joux's book [Jou09] and we mention that a similar technique has already been independently investigated by Bernstein in [Ber08].

4 A $\tilde{O}(p^{1/2})$-Deterministic Factoring Algorithm for $pq^2$

We detail our new quadratic form-based factoring algorithm for numbers of the form $pq^2$. In this section, p and q will be of the same bit-size, and $p \equiv 1 \pmod 4$.


4.1 The Algorithm 

Roughly speaking, if $\Delta_q = N = pq^2$, our factoring algorithm, depicted in Fig. 3, exploits the fact that the non-reduced forms $f_k = [q^2, kq, -]$ reduce to forms for which there exists a small pair $(x_0, y_0)$ at which the reduced form takes a value divisible by $q^2$, while $q^2 \mid N$. By the theorem established earlier, these reduced forms appear on the principal cycle of the class group of discriminant $\Delta_q$. To detect them, we start a walk in the principal cycle from the principal form $1_N$, and apply Rho until the Coppersmith-like method finds these small solutions.


Input: $N = pq^2$, m, t
Output: p, q

1. h ← $1_N$
2. while $(x_0, y_0)$ not found do
   2.1. h ← Rho(h)
   2.2. $x_0/y_0$ ← HomogeneousCoppersmith(h, N, m, t)
3. q ← Sqrt(Gcd($h(x_0, y_0)$, N))
4. return ($N/q^2$, q)


Fig. 3. Factoring $N = pq^2$
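
The control flow of Fig. 3 can be sketched as follows. This is only a skeleton under stated assumptions: the reduction operator and the root finder are passed in as callables (`rho` and `homogeneous_coppersmith` are hypothetical names, with the parameters m and t assumed to be bound inside the latter), and forms are represented as triples (a, b, c). It is not the implementation used for the experiments.

```python
from math import gcd, isqrt


def evaluate(form, x, y):
    """Value a*x^2 + b*x*y + c*y^2 of the binary quadratic form (a, b, c)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y


def factor_pq2(N, principal_form, rho, homogeneous_coppersmith, max_steps=10**6):
    """Walk along the principal cycle until a form with a small root is found.

    rho(form) -> next reduced form (assumed supplied by the caller)
    homogeneous_coppersmith(form, N) -> (x0, y0) or None (assumed supplied)
    """
    h = principal_form
    for _ in range(max_steps):
        h = rho(h)
        root = homogeneous_coppersmith(h, N)
        if root is None:
            continue
        x0, y0 = root
        g = gcd(evaluate(h, x0, y0), N)
        q = isqrt(g)
        # Accept only if the gcd really is a non-trivial square dividing N.
        if q > 1 and q * q == g and N % g == 0:
            return N // g, q
    return None
```

For instance, with the toy value N = 245 = 5 · 7², the form (49, 7, -1) has discriminant 245 and takes the value 49 at (1, 0), so a stub that returns this form and this root makes the driver output (5, 7).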


4.2 Heuristic Correctness and Analysis of Our Algorithm 

Assuming our heuristic, starting from $1_N$, after $O(R_p)$ iterations, the algorithm will stop on a reduced form whose roots will be found with our Coppersmith-like method (for suitable values of m and t) since they will satisfy the expected $N^{1/9}$ bound. The computation of $\gcd(h(x_0, y_0), N)$ will therefore expose $q^2$ and factor N. The time complexity of our algorithm is then heuristically $O(R_p\,\mathrm{Poly}(\log N))$, whereas the space complexity is $O(\log N)$. The worst-case complexity is $O(p^{1/2} \log p\,\mathrm{Poly}(\log N))$. For small regulators, such as in the REAL-NICE cryptosystem (see Subsection 5.1), the time complexity is polynomial.

This algorithm can be generalised with a few modifications to primes p such that $p \equiv 3 \pmod 4$, by considering $\Delta_q = 4pq^2$. Moreover, if the bit-sizes of p and q are imbalanced, our experiments suggest that the size of the roots will be small enough (see end of Subsection 2.3 and Section 3), so the factoring algorithm will also work in this case, with the same complexity.

Comparison with other Deterministic Factorisation Methods. Boneh, Durfee and Howgrave-Graham presented in [BDH99] an algorithm for factoring integers $N = p^r q$. Their main result is the following:

Lemma 3 ([BDH99]). Let $N = p^r q$ be given, and assume $q < p^c$ for some c. Furthermore, assume that P is an integer satisfying $|P - p| < p^{1 - \frac{c}{r+c} - 2\frac{r}{d}}$. Then the factor p may be computed from N, r, c and P by an algorithm whose running time is dominated by the time it takes to run LLL on a lattice of dimension d.

For r = 2 and c = 1, this leads to a deterministic factoring algorithm which consists in exhaustively searching for an approximation P of p and solving the polynomial equation $(P + X)^2 \equiv 0 \pmod{p^2}$ with a method à la Coppersmith. The approximation will be found after $O(p^{1/3}) = O(N^{1/9})$ iterations.

The fastest deterministic generic integer factorisation algorithm is actually a version of Strassen's algorithm [Str76] from Bostan, Gaudry and Schost [BGS07], who improve a work of Chudnovsky and Chudnovsky [CC87] and prove a complexity of $O(M_{\mathrm{int}}(\sqrt[4]{N} \log N))$, where $M_{\mathrm{int}}$ is a function such that integers of bit-size d can be multiplied in $M_{\mathrm{int}}(d)$ bit operations. More precisely, for numbers of our interest, Lemma 13 from [BGS07] gives the precise complexity:

Lemma 4 ([BGS07]). Let b, N be two integers with 2 < b < N. One can compute a prime divisor of N bounded by b, or prove that no such divisor exists, in $O\left(M_{\mathrm{int}}(\sqrt{b} \log N) + \log b\, M_{\mathrm{int}}(\log N) \log\log N\right)$ bit operations and space $O(\sqrt{b} \log N)$ bits.

In particular, for $b = N^{1/3}$, the complexity is $\tilde{O}(N^{1/6})$, with a very large space complexity compared to our algorithm. Moreover, neither of these last two algorithms can actually factor an integer of cryptographic size. The fact that a prime divisor has a small regulator does not help in these algorithms, whereas it makes the factorisation polynomial in our method.

5 Cryptanalysis of the NICE Cryptosystems 

Hartmann, Paulus and Takagi proposed the elegant NICE encryption scheme (see [HPT99,PT98,PT00]), based on imaginary quadratic fields and whose main feature was a quadratic decryption time. Later on, several other schemes, including (special) signature schemes relying on this framework, have been proposed. The public key of these NICE cryptosystems contains a discriminant $\Delta_q = -pq^2$ together with a reduced ideal $\mathfrak{h}$ whose class belongs to the kernel of $\varphi_q$. The idea underlying the NICE cryptosystem is to hide the message behind a random element $[\mathfrak{h}]^r$ of the kernel. Applying $\varphi_q$ will make this random element disappear, and the message will then be recovered.

In [JSW08], Jacobson, Scheidler and Weimer embedded the original NICE cryptosystem in real quadratic fields. Whereas the idea remains essentially the same as the original, the implementation is very different. The discriminant is now $\Delta_q = pq^2$, but because of the differences between the imaginary and real settings, these discriminants will have to be chosen carefully. Among these differences, the class numbers are expected to be small with very high probability (see the Cohen-Lenstra heuristics [CL84]). Moreover, an equivalence class does not contain a unique reduced element anymore, but a multitude of them, whose number is governed by the size of the fundamental unit. The rough ideas to understand these systems and our new attacks are given in the following. The full description of the systems is omitted for lack of space but can be found in the papers cited above.


5.1 Polynomial-Time Key Recovery in the Real Setting 

The core of the design of the REAL-NICE encryption scheme is the very particular choice of the secret prime numbers p and q such that $\Delta_K = p$ and $\Delta_q = pq^2$. They are chosen such that the ratio $R_{\Delta_q}/R_{\Delta_K}$ is of the order of magnitude of q and that $R_{\Delta_K}$ is bounded by a polynomial in $\log(\Delta_K)$. To ensure the first property, it is sufficient to choose q such that $q - \left(\frac{\Delta_K}{q}\right)$ is a small multiple of a large prime. Although the second property is very unlikely to happen naturally, since the regulator of p is generally of the order of magnitude of $\sqrt{p}$, it is quite easy to construct fundamental primes with small regulator. The authors of [JSW08] suggest producing a prime p as a so-called Schinzel sleeper, which is a positive squarefree integer of the form $p = a^2x^2 + 2bx + c$ with a, b, c, x in $\mathbb{Z}$, $a \neq 0$ and $b^2 - a^2c$ dividing $4\gcd(a^2, b)^2$. Schinzel sleepers are known to have a regulator of the order $\log(p)$ (see [CW05]). Some care must be taken when setting the (secret) a, b, c, x values, otherwise the resulting $\Delta_q = pq^2$ is subject to the factorisation attacks described in [Wei04]. We do not provide here more details on these choices since the crucial property for our attack is the fact that the regulator is actually of the order $\log(p)$. The public key consists of the sole discriminant $\Delta_q$. The message is carefully embedded (and padded) into a primitive $\mathcal{O}_{\Delta_q}$-ideal so that it will be recognised during decryption. Instead of moving the message ideal $\mathfrak{m}$ to a different equivalence class (like in the imaginary case), the encryption actually hides the message in the cycle of reduced ideals of its own equivalence class by multiplication by a random principal $\mathcal{O}_{\Delta_q}$-ideal $\mathfrak{h}$ (computed during encryption). The decryption process then consists in applying the (secret) map $\varphi_q$ and performing an exhaustive search for the padded message in the small cycle of $\varphi_q([\mathfrak{m}\mathfrak{h}])$. This exhaustive search is actually possible thanks to the choice of p, which has a very small regulator. Like in the imaginary case, the decryption procedure has a quadratic complexity and significantly outperforms an RSA decryption for any given security level (see Table 3 from [JSW08]).

Unfortunately, due to the particular but necessary choice of the secret prime p, the following result states the total insecurity of the REAL-NICE system.

Result 1. The algorithm of Fig. 3 recovers the secret key of REAL-NICE in polynomial time in the security parameter under our heuristic, since the secret fundamental discriminant p is chosen to have a regulator bounded by a polynomial in $\log p$.
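
The Schinzel-sleeper construction of p can be sanity-checked on the polynomial used in the example of this subsection. The sketch below assumes the usual normalisation in which, for $a^2x^2 + 2bx + c$, the quantity $b^2 - a^2c$ must divide $4\gcd(a^2, b)^2$; the helper names are ours, and primality of the resulting value is not tested.

```python
from math import gcd


def schinzel_condition(a, b, c):
    """True if b^2 - a^2*c is non-zero and divides 4*gcd(a^2, b)^2."""
    d = b * b - a * a * c
    return d != 0 and (4 * gcd(a * a, b) ** 2) % d == 0


def schinzel_value(a, b, c, x):
    """Value of the polynomial a^2*x^2 + 2*b*x + c at x."""
    return a * a * x * x + 2 * b * x + c


# Coefficients and evaluation point quoted in the example of this subsection.
a, b, c = 2725, 3815, 2
x0 = 103042745825387139695432123167592199
p = schinzel_value(a, b, c, x0)

if __name__ == "__main__":
    print("Schinzel condition holds:", schinzel_condition(a, b, c))
    print("bit length of p:", p.bit_length())
    print("p mod 4:", p % 4)
```

Here $b^2 - a^2c = -545^2$ and $\gcd(a^2, b) = 545$, so the divisibility condition holds with quotient $-4$.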

We apply the cryptanalysis on the following example. The Schinzel polynomial $S(X) = 2725^2X^2 + 2 \cdot 3815X + 2$ produces a suitable 256-bit prime p for the value $X_0$ = 103042745825387139695432123167592199. This prime has a regulator $R_{\Delta_K} \approx 90.83$. The second 256-bit prime q is chosen following the recommendations from [Wei04]. This leads to the discriminant

$\Delta_q$ = 28736938823310044873380716142282073396186843906757463274792638734144060602830510
80738669163489273592599054529442271053869832485363682341892124500678400322719842 
63278692833860326257638544601057379571931906787755152745236263303465093 

Our algorithm recovers the prime 

q = 60372105471499634417192859173853663456123015267207769653235558092781188395563 
from $\Delta_q$ after 45 iterations in 42.42 seconds on a standard laptop. The rational



5.2 Polynomial-Time Key Recovery of the Original NICE 

As mentioned above, the public key of the original NICE cryptosystem contains the representation of a reduced ideal $\mathfrak{h}$ whose class belongs to the kernel of the surjection $\varphi_q$. The total break of the NICE cryptosystem is equivalent to solving the following kernel problem.

Definition 6 (Kernel Problem [BPT04]). Let $\lambda$ be an integer, p and q be two $\lambda$-bit primes with $p \equiv 3 \pmod 4$. Fix a non-fundamental discriminant $\Delta_q = -pq^2$. Given an element $[\mathfrak{h}]$ of $\ker \varphi_q$, factor the discriminant $\Delta_q$.

Castagnos and Laguillaumie proposed in [CL09] a polynomial-time algorithm to solve this problem. We propose here a completely different solution within the spirit of our factorisation method and whose complexity is also polynomial-time. As discussed in Subsection 2.3, the idea is to benefit from the fact that the public ideal $\mathfrak{h}$ corresponds to a reduced quadratic form, $f_{\mathfrak{h}}$, which represents $q^2$. We thus find $x_0$ and $y_0$ such that $\gcd(f_{\mathfrak{h}}(x_0, y_0), \Delta_q) = q^2$ with the Coppersmith method of Section 3.

Result 2. The Homogeneous Coppersmith method from Section 3 solves the Kernel Problem in polynomial time in the security parameter under the corresponding heuristic assumption.




We apply our key recovery on the example of NICE proposed in |,l. 1001 071^: 

$\Delta_q$ = -1001133619402846750073919037082619174565372425946674915149340539464219927955168
18216760083640752198709726199732701843864411853249644535365728802022498185665592 
98370854645328210791277591425676291349013221520022224671621236001656120923 

a = 5702268770894258318168588438117558871300783180769995195092715895755173700399 
141486895731384747 

b = 3361236040582754784958586298017949110648731745605930164666819569606755029773
074415823039847007 

The public key consists of $\Delta_q$ and $\mathfrak{h} = (a, b)$. Our Coppersmith method finds in less than half a second the root $u_0$ and

$h(x_0, y_0)$ = 5363123171977038839829609999282338450991746328236957351089
4245774887056120365979002534633233830227721465513935614971
593907712680952249981870640736401120729 = $q^2$.

All our experiments have been run on a standard laptop under Linux with the Sage software. The lattice reductions have been performed with Stehlé's fplll [Ste].

Abstract. We look at iterated power generators $s_i = s_{i-1}^e \bmod N$ for a random seed $s_0 \in \mathbb{Z}_N$ that in each iteration output a certain amount of bits. We show that heuristically an output of $(1 - \frac{1}{e}) \log N$ most significant bits per iteration allows for efficient recovery of the whole sequence. This means in particular that the Blum-Blum-Shub generator should be used with an output of less than half of the bits per iteration and the RSA generator with e = 3 with less than a $\frac{2}{3}$-fraction of the bits.

Our method is lattice-based and introduces a new technique, which 
combines the benefits of two techniques, namely the method of lineariza- 
tion and the method of Coppersmith for finding small roots of polynomial 
equations. We call this new technique unravelled linearization. 

Keywords: power generator, lattices, small roots, systems of equations. 


1 Introduction 

Pseudorandom number generators (PRGs) play a crucial role in cryptography. An especially simple construction is provided by iterating the RSA function $s_i = s_{i-1}^e \bmod N$ for an RSA modulus N = pq of bit-size n and a seed $s_0 \in \mathbb{Z}_N$. This so-called power generator outputs in each iteration a certain amount of bits of $s_i$, usually the least significant bits. In order to minimize the amount of computation per iteration, one typically uses small e such as e = 3. With slight modifications one can choose e = 2 as well when replacing the iteration function by the so-called absolute Rabin function [3,4], where $s^2 \bmod N$ is defined to be $\min\{s^2 \bmod N, N - s^2 \bmod N\}$, N is a Blum integer and $s_0$ is chosen from $\{0, \ldots, \frac{N-1}{2}\}$ with Jacobi symbol +1.

It is well-known that under the RSA assumption one can safely output up to $\Theta(\log n) = \Theta(\log \log N)$ bits per iteration. At Asiacrypt 2006, Steinfeld, Pieprzyk and Wang showed that under a stronger assumption regarding the optimality of some well-studied lattice attacks, one can securely output $(\frac{1}{2} - \frac{1}{e} - \epsilon - o(1))n$ bits. The assumption is based on a specific RSA one-wayness problem, where one is given an RSA ciphertext $c = m^e \bmod N$ together with a certain fraction of the plaintext bits of m, and one has to recover the whole plaintext m. We call this generator the SPW generator. The SPW generator has the desirable property that one can output a constant fraction $\Theta(\log N)$ of all bits per iteration. Using an even stronger assumption, Steinfeld, Pieprzyk and Wang could improve the output size to $(\frac{1}{2} - \frac{1}{2e} - \epsilon - o(1))n$ bits.

* This research was supported by the German Research Foundation (DFG) as part of the project MA 2536/3-1 and by the European Commission through the ICT programme under contract ICT-2007-216676 ECRYPT II.

A natural question is whether the amount of output bits of the SPW generator is maximal. Steinfeld et al.'s security proof uses in a black-box fashion the security proof of Fischlin and Schnorr for RSA bits. This proof unfortunately introduces a factor of $\frac{1}{2}$ for the output rate of the generator. So, Steinfeld et al. conjecture that one might improve the rate to $(1 - \frac{1}{e} - \epsilon)n$ using a different proof technique. Here, $\epsilon$ is a security parameter and has to be chosen such that performing $2^{\epsilon n}$ operations is infeasible. We show that this bound is essentially the best that one can hope for by giving an attack up to the bound $(1 - \frac{1}{e})n$.

In previous cryptanalytic approaches, upper bounds for the number of output bits have been studied by Blackburn, Gomez-Perez, Gutierrez and Shparlinski [2]. For e = 2 and a class of PRGs similar to power generators (but with prime moduli), they showed that provably $\frac{2}{3}n$ bits are sufficient to recover the secret seed $s_0$. As mentioned in Steinfeld et al., this bound can be generalized to $(1 - \frac{1}{e+1})n$ using the heuristic extension of Coppersmith's method to multivariate equations.

Our contribution: We improve the cryptanalytic bound to $(1 - \frac{1}{e})n$ bits using a new heuristic lattice-based technique. Notice that the two most interesting cases are e = 2, 3, the Blum-Blum-Shub generator and the RSA generator. For these cases, we improve on the best known attack bounds from $\frac{2}{3}n$ to $\frac{1}{2}n$ and from $\frac{3}{4}n$ to $\frac{2}{3}n$, respectively. Unfortunately, similar to the result of Blackburn et al. [2], our results are restricted to power generators that output most significant bits in each iteration. It remains an open problem to show that the bounds hold for least significant bits as well.

Our improvement comes from a new technique called unravelled linearization, which is a hybrid of lattice-based linearization and the lattice-based technique due to Coppersmith. Let us illustrate this new technique with a simple example. Assume we want to solve a polynomial equation $x^2 + ax + b = y \bmod N$ for some given $a, b \in \mathbb{Z}_N$ and some unknowns x, y. This problem can be considered as finding the modular roots of a univariate polynomial $f(x) = x^2 + ax + b$ with some error y.

It is a well-known heuristic that a linear modular equation can be easily solved by computing a shortest lattice vector, provided that the absolute value of the product of the unknowns is smaller than the modulus. In order to linearize our equation, we substitute $u := x^2$ and end up with a linear equation in u, x, y. This can be solved whenever $|uxy| < N$. If we assume for simplicity that the unknowns x, y are of the same size, this yields the condition $|x| < N^{1/4}$.

However, in the above case it is easy to see that this linearization is not optimal. A better linearization would define $u := x^2 - y$, leaving us with a linear equation in u, x only. This yields the superior condition $|x| < N^{1/3}$. So one benefits from the fact that one can easily glue variables together, in our case $x^2$ and y, whenever this does not change the size of the larger variable. In our example this would also work when y had a known coefficient c of size $|c| \approx |y|$.
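
The two conditions can be compared side by side. The following worked computation is a sketch under the simplifying assumption, made in the text, that $|x|, |y| \le X$, so that both $x^2$ and $x^2 - y$ have size about $X^2$ (constant factors are ignored):

```latex
\begin{align*}
\text{naive: } u := x^2
  &\;\Longrightarrow\; |u x y| \le X^2 \cdot X \cdot X = X^4 < N
  \;\Longrightarrow\; X < N^{1/4},\\
\text{glued: } u := x^2 - y
  &\;\Longrightarrow\; |u x| \le X^2 \cdot X = X^3 < N
  \;\Longrightarrow\; X < N^{1/3}.
\end{align*}
```

Removing y as a separate unknown saves one factor X in the product of the unknowns, which is exactly the gain from exponent 1/4 to 1/3.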

The main benefit from the attack of Blackburn et al. [2] comes from a clever linearization of the variables that occur in the case of power generators. While on the one hand such a linearization of a polynomial equation offers some advantages, on the other hand we lose the algebraic structure. Performing e.g. the substitution $u := x^2$, one obtains a linear equation in u, x, y but the property that u and x are algebraically dependent (one being the square of the other) is completely lost. Naturally, this drawback becomes more dramatic when looking at higher degree polynomials.

As a consequence, Coppersmith designed in 1996 a lattice-based method that is well-suited for exploiting polynomial structures. The underlying idea is to additionally use algebraic relations before linearization. Let us illustrate this idea with our example polynomial $f(x,y) = x^2 + ax + b - y$. We know that whenever f has a small root modulo N, then also $xf = x^3 + ax^2 + bx - xy$ shares this root. Using xf as well, we obtain two modular equations in five unknowns $x^3, x^2, x, y, xy$. Notice that the unknowns $x^2$ and x are re-used in the second equation, which reflects the algebraic structure. So even after linearizing both equations, Coppersmith's method preserves some polynomial structure. In addition to multiplication of f by powers of x and y (which is often called shifting in the literature) one also allows for powers $f^i$ with the additional benefit of obtaining equations modulo larger moduli $N^i$.

When we compute the enabling condition with Coppersmith’s method for 
our example f(x, y) using an optimal shifting and powering, we obtain a bound 
of | a; | < IV 3. So the method yields a better bound than naive linearization, 
but cannot beat the bound of the more clever linearization with u := x 2 — y. 
Even worse, Coppersmith’s method results in the use of lattices of much larger 
dimension. 

To summarize, linearization makes use of the similarity of coefficients in a 
polynomial equation, whereas Coppersmith’s method basically makes use of the 
structure of the polynomial’s monomial set. 

Motivation for unravelled linearization: Our new technique of unravelled linearization aims to bring together the best of both worlds. Namely, we allow for clever linearization but still exploit the polynomial structure. Unravelled linearization proceeds in three steps: linearization, basis construction, and unravellation. Let us illustrate these steps with our example $f(x, y)$, where we use the linearization $u := x^2 - y$ in the first step. In this case, we end up with a linear polynomial $g(u,x)$. Similar to Coppersmith's approach, in the second step we use shifts and powers of this polynomial. E.g., $g^2$ defines an equation in the unknowns $u^2, ux, x^2, u, x$ modulo $N^2$. But since we start with a linear polynomial g, this alone will not bring us any benefits, because the algebraic structure got lost in the linearization process from f to g.

Therefore, in the third step we partially unravel the linearization for $g^2$ using the relation $x^2 = y + u$. The unravelled form of $g^2$ defines a modular equation in the unknowns $u^2, ux, y, u, x$, where we basically substitute the unknown $x^2$ by the unknown y. Notice here that we can reuse the variable u, which occurs in $g^2$ anyway. This substitution leads to a significant gain, since y is much smaller in size than $x^2$.

In the present paper, we elaborate on this simple observation that unravelling 
of linearization brings benefits to lattice reduction algorithms. We use the equa- 
tions that result from the power generator as a case study for demonstrating the 
power of unravelled linearization, but we are confident that our new technique 
will also find new applications in various other contexts. 

The paper is organized as follows. In Section 2 we will fix some very basic notions for lattices. In Section 3 we define our polynomials from the power generator with e = 2 and give a toy example with only two PRG iterations that illustrates how unravelled linearization works. This already leads to an improved bound of $\frac{7}{11}n$. In Section 4 we generalize to arbitrary lattice dimension (bound $\frac{3}{5}n$) and in Section 5 we generalize to an arbitrary number of PRG iterations (bound $\frac{1}{2}n$). In Section 6 we finally generalize to an arbitrary exponent e. Since our attacks rely on Coppersmith-type heuristics, we verify the heuristics experimentally in Section 7.

2 Basics on Lattices 

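Let $b_1, \ldots, b_d \in \mathbb{Q}^d$ be linearly independent. Then the set

$$L := \left\{ x \in \mathbb{Q}^d \;\middle|\; x = \sum_{i=1}^{d} a_i b_i,\ a_i \in \mathbb{Z} \right\}$$

is called a lattice L with basis matrix $B \in \mathbb{Q}^{d \times d}$, having the vectors $b_1, \ldots, b_d$ as row vectors. The parameter d is called the lattice dimension, denoted by $\dim(L)$. The determinant of the lattice is defined as $\det(L) := |\det(B)|$.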

The famous LLL algorithm computes a basis consisting of short and pairwise almost orthogonal vectors. Let $v_1, \ldots, v_d$ be an LLL-reduced lattice basis with Gram-Schmidt orthogonalized vectors $v_1^*, \ldots, v_d^*$. Intuitively, the property of pairwise almost orthogonal vectors $v_1, \ldots, v_d$ implies that the norm of the Gram-Schmidt vectors $v_1^*, \ldots, v_d^*$ cannot be too small. This is quantified in the following theorem of Jutla that follows from the LLL paper.

Theorem 1 (LLL). Let L be a lattice spanned by $B \in \mathbb{Q}^{d \times d}$. On input B, the $L^3$-algorithm outputs an LLL-reduced lattice basis $\{v_1, \ldots, v_d\}$ with



for i = 1, . . . , d 


in time polynomial in d and in the bit-size of the largest entry b max of the basis 
matrix B. 


Attacking Power Generators Using Unravelled Linearization 491 


3 Power Generators with e = 2 and Two Iterations


Let us consider power generators defined by the recurrence sequence

$$s_i = s_{i-1}^e \bmod N,$$

where N is an RSA modulus and $s_0 \in \mathbb{Z}_N$ is the secret seed.

Suppose that the power generator outputs in each iteration the most significant bits $k_i$ of $s_i$, i.e. $s_i = k_i + x_i$, where the $k_i$ are known for $i \ge 1$ and the $x_i$ are unknown.

Our goal is to recover all $x_i$ for a number of output bits $k_i$ that is as small as possible. In other words, if we define $x_i < N^{\delta}$ then we have to find an attack that maximizes $\delta$.

Let us start with the simplest case of two iterations and e = 2. The best known bound is $\delta = \frac{1}{3}$ due to Blackburn et al. [2]. We will later generalize to an arbitrary number of iterations and also to an arbitrary e.

For the case of two iterations, we obtain

$$s_1 = k_1 + x_1 \quad\text{and}\quad s_2 = k_2 + x_2,$$

for some unknown $s_i, x_i$. The recurrence relation of the generator $s_2 = s_1^2 \bmod N$ yields $k_2 + x_2 = (k_1 + x_1)^2 \bmod N$, which results in the polynomial equation

$$x_1^2 - x_2 + 2k_1x_1 + k_1^2 - k_2 = 0 \bmod N.$$

Thus, we search for small modular roots of $f(x_1, x_2) = x_1^2 - x_2 + ax_1 + b$ modulo N.
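
The derivation above can be checked on a toy instance. The sketch below simulates two iterations of the generator with e = 2, splits each state into a known high part $k_i$ and an unknown low part $x_i$, and verifies that $(x_1, x_2)$ is a root of f modulo N. The modulus, seed and number of hidden bits are arbitrary illustrative choices, far too small for any real use.

```python
def power_generator_states(seed, e, N, rounds):
    """Internal states s_1, ..., s_rounds of the power generator s_i = s_{i-1}^e mod N."""
    states, s = [], seed
    for _ in range(rounds):
        s = pow(s, e, N)
        states.append(s)
    return states


def split_state(s, unknown_bits):
    """Split s into (k, x): k carries the most significant bits, x the hidden low bits."""
    x = s % (1 << unknown_bits)
    return s - x, x


def f(x1, x2, k1, k2, N):
    """f(x1, x2) = x1^2 - x2 + a*x1 + b mod N with a = 2*k1 and b = k1^2 - k2."""
    return (x1 * x1 - x2 + 2 * k1 * x1 + k1 * k1 - k2) % N


N = 103 * 107                     # toy Blum integer, both primes are 3 mod 4
s1, s2 = power_generator_states(seed=1234, e=2, N=N, rounds=2)
(k1, x1), (k2, x2) = split_state(s1, 5), split_state(s2, 5)

if __name__ == "__main__":
    print("f(x1, x2) mod N =", f(x1, x2, k1, k2, N))
```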

Let us first illustrate our new technique called unravelled linearization with a small-dimensional lattice attack before we apply it in full generality in Section 4.

Step 1: Linearize $f(x_1, x_2)$ into g.

We make the substitution $u := x_1^2 - x_2$. This leaves us with a linear polynomial $g(u, x_1) = u + ax_1 + b$.

Step 2: Basis construction.

Defining standard shifts and powers for g is especially simple, since g is a linear polynomial. If we fix a total degree bound of m = 2, then we choose $g$, $x_1g$ and $g^2$.

Let $X := N^{\delta}$ be an upper bound for $x_1, x_2$. Then $U := N^{2\delta}$ is an upper bound for u. The choice of the shift polynomials results in a lattice L spanned by the rows of the lattice basis B depicted in Figure 1.

Let $(u_0, x_0)$ be a root of g. Then the vector $v = (1, x_0, x_0^2, u_0, u_0x_0, u_0^2, k_1, k_2, k_3)B$ has its three right-hand coordinates equal to 0 for suitably chosen $k_i \in \mathbb{Z}$. Hence we can write v as $v = (1, \frac{x_0}{X}, \ldots, \frac{u_0^2}{U^2}, 0, 0, 0)$. Since $|u_0| < U$ and $|x_0| < X$, we obtain $\|v\| < \sqrt{6}$.

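To summarize, we are looking for a short vector v in the 6-dimensional sublattice $L' = L \cap (\mathbb{Q}^6 \times 0^3)$ with $\|v\| < \sqrt{\dim(L')}$. Let $b_1, \ldots, b_6$ be an LLL-reduced basis of L' with orthogonalized basis $b_1^*, \ldots, b_6^*$. Coppersmith showed that

$$B = \begin{pmatrix}
1 & & & & & & b & 0 & b^2 \\
 & \frac{1}{X} & & & & & a & b & 2ab \\
 & & \frac{1}{X^2} & & & & 0 & a & a^2 \\
 & & & \frac{1}{U} & & & 1 & 0 & 2b \\
 & & & & \frac{1}{UX} & & 0 & 1 & 2a \\
 & & & & & \frac{1}{U^2} & 0 & 0 & 1 \\
 & & & & & & N & 0 & 0 \\
 & & & & & & 0 & N & 0 \\
 & & & & & & 0 & 0 & N^2
\end{pmatrix}$$

(rows: the monomials $1, x_1, x_1^2, u, ux_1, u^2$ followed by the three modular rows; the last three columns hold the coefficients of $g$, $x_1g$ and $g^2$)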

Fig. 1. After linearization and standard shifts and powers for m = 2 


any vector $v \in L'$ that is smaller than $b_6^*$ must lie in the sub-space spanned by $b_1, \ldots, b_5$, i.e. v is orthogonal to $b_6^*$. This immediately yields a coefficient vector of a polynomial $h(u, x_1)$, which has the same roots as $g(u, x_1)$, but over the integers instead of modulo N. Assume that we can find two such polynomials $h_1, h_2$; then we can compute all small roots by resultant computation provided that $h_1, h_2$ do not share a common divisor. The only heuristic of our method is that the polynomials $h_1, h_2$ are indeed coprime.

By the LLL-Theorem (Theorem 1), an orthogonalized LLL-basis contains a vector $b_d^*$ in L' with $\|b_d^*\| \ge c(d) \det(L')^{1/d}$, where $c(d) = 2^{-(d-1)/4}$ and $d = \dim(L')$. Thus, if the condition

$$c(d) \det(L')^{1/d} > \sqrt{d}$$

holds, then $v = (1, \frac{x_0}{X}, \ldots, \frac{u_0^2}{U^2})$ will be orthogonal to the vector $b_d^*$.

Since $\det(L')$ is a function of N, we can neglect $d = \dim(L')$ for large enough N. This in turn simplifies our condition to

$$\det(L') > 1.$$

Moreover, one can show by a unimodular transformation of B that $\det(L') = \det(L)$.

For our example, the enabling condition $\det(L) > 1$ translates to $U^4X^4 < N^4$. Plugging in the values of $X := N^{\delta}$ and $U := N^{2\delta}$, this leads to the condition $\delta < \frac{1}{3}$. Notice that this is exactly the condition from Blackburn et al. [2]. Namely, if the PRG outputs $\frac{2}{3}n$ bits per iteration, then the remaining bits can be found in polynomial time.

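We will now improve on this result by unravelling the linearization of g.

Step 3: Unravel g's linearization.

We unravel the linearization by back-substitution of $x_1^2 = u + x_2$. This slightly changes our lattice basis (see Fig. 2).

The main difference is that the determinant of the new lattice $L_u$ increases by a factor of X. Thus our enabling condition $\det(L_u) > 1$ yields $U^4X^3 < N^4$ or equivalently $\delta < \frac{4}{11}$. This means that if the PRG outputs $\frac{7}{11}$ of the bits in

$$B_u = \begin{pmatrix}
1 & & & & & & b & 0 & b^2 \\
 & \frac{1}{X} & & & & & a & b & 2ab \\
 & & \frac{1}{X} & & & & 0 & a & a^2 \\
 & & & \frac{1}{U} & & & 1 & a & a^2 + 2b \\
 & & & & \frac{1}{UX} & & 0 & 1 & 2a \\
 & & & & & \frac{1}{U^2} & 0 & 0 & 1 \\
 & & & & & & N & 0 & 0 \\
 & & & & & & 0 & N & 0 \\
 & & & & & & 0 & 0 & N^2
\end{pmatrix}$$

(rows: the monomials $1, x_1, x_2, u, ux_1, u^2$ followed by the three modular rows)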

Fig. 2. After unravelling the linearization 


each of two iterations, then we can reconstruct the remaining $\frac{4}{11}n$ bits of both iterations in polynomial time. This beats the previous bound of $\frac{2}{3}n$.
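
Each enabling condition in this section has the shape $X^{a}U^{b} < N^{c}$ with $X = N^{\delta}$ and $U = N^{2\delta}$, so the resulting bound on $\delta$ is a one-line computation. The helper below is a small sketch of that arithmetic (the function name is ours); it reproduces the bound before and after unravelling.

```python
from fractions import Fraction


def delta_bound(x_exp, u_exp, n_exp):
    """Supremum of delta such that X^x_exp * U^u_exp < N^n_exp,
    where X = N^delta and U = N^(2*delta)."""
    return Fraction(n_exp) / (Fraction(x_exp) + 2 * Fraction(u_exp))


if __name__ == "__main__":
    print("linearized basis, U^4 X^4 < N^4:", delta_bound(4, 4, 4))
    print("unravelled basis, U^4 X^3 < N^4:", delta_bound(3, 4, 4))
```

The number of bits that must be output per iteration is then $(1 - \delta)n$ at the bound.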

We would like to stress again that our approach is heuristic. We construct two polynomials $h_1, h_2$. The polynomials $h_1, h_2$ contain a priori three variables $x_1, x_2, u$, but substituting u by $x_1^2 - x_2$ results in two bivariate polynomials $h_1', h_2'$. Then, we hope that $h_1'$ and $h_2'$ are coprime and thus allow for efficient root finding. We verified this heuristic with experiments in Section 7.

4 Generalization to Lattices of Arbitrary Dimension 

The linearization step from $f(x_1, x_2)$ to $g(u, x_1)$ is done as in the previous section using $u := x_1^2 - x_2$. For the basis construction step, we fix an integer m and define the following collection of polynomials

$$g_{i,j}(u, x_1) := x_1^j g^i(u, x_1) \quad\text{for } i = 1, \ldots, m \text{ and } j = 0, \ldots, m - i. \qquad (1)$$

In the unravelling step, we substitute each occurrence of $x_1^2$ by $u + x_2$ and change the lattice basis accordingly. It remains to compute the determinant of the resulting lattice. This appears to be a non-trivial task due to the various back-substitutions. Therefore, we did not compute the lattice determinant as a function of m by hand. Instead, we developed an automated process that might be useful in other contexts as well.

We observe that the determinant can be calculated by knowing first the product of all monomials that appear in the collection of the $g_{i,j}$ after unravelling, and second the product of all N. Let us start with the product of the N, since it is easy to compute from Equation (1):

$$\prod_{i=1}^{m} \prod_{j=0}^{m-i} N^{i} = \prod_{i=1}^{m} N^{(m+1)i - i^2} = N^{\frac{1}{6}m^3 + o(m^3)}.$$


The polynomial $h_2$ can be constructed from $b_{d-1}^*$ with a slightly more restrictive condition on $\det(L)$ coming from Theorem 1. However, in practical experiments the simpler condition $\det(L) > 1$ seems to suffice for $h_2$ as well. In the subsequent sections, this minor detail is captured by the asymptotic analysis.

Now let us bound the product of all monomials. Each variable $x_1, x_2, u$ appears in the unravelled form of $g_{i,j}$ with power at most 2m. Therefore, the product of all monomials that appear in all $\frac{1}{2}m^2 + o(m^2)$ polynomials has in each variable degree at most $m^3$. Thus, we can express the exponent of each variable as a polynomial function in m of degree 3 with rational coefficients, similar to the exponent of N.

But since we know that the exponents are polynomials in m of degree at most 3, we can uniquely determine them by a polynomial interpolation at 4 points. Namely, we explicitly compute the unravelled basis for m = 1, ..., 4 and count the number of variables that occur in the unravelled forms of the $g_{i,j}$. From these values, we interpolate the polynomial function for arbitrary m.
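
The interpolation idea can be illustrated on the one exponent that is also known in closed form, namely the exponent of N. The sketch below counts this exponent for m = 1, ..., 4 directly from the collection (1), where $g_{i,j}$ is an equation modulo $N^i$, and recovers the cubic by Lagrange interpolation over the rationals. It is only an illustration of the technique; the counting of the X- and U-exponents in the unravelled basis is not reproduced here.

```python
from fractions import Fraction


def n_exponent(m):
    """Exponent of N in the determinant: g_{i,j} is taken modulo N^i
    for i = 1..m and j = 0..m-i, i.e. (m - i + 1) shifts per power i."""
    return sum(i * (m - i + 1) for i in range(1, m + 1))


def interpolate(points):
    """Coefficients (lowest degree first) of the unique polynomial of degree
    < len(points) passing through the given (x, y) points."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for k, (xk, yk) in enumerate(points):
        basis = [Fraction(1)]          # Lagrange basis numerator, starts as 1
        denom = Fraction(1)
        for l, (xl, _) in enumerate(points):
            if l == k:
                continue
            # Multiply the current basis polynomial by (x - xl).
            new = [Fraction(0)] * (len(basis) + 1)
            for d, c in enumerate(basis):
                new[d] -= c * xl
                new[d + 1] += c
            basis = new
            denom *= (xk - xl)
        for d, c in enumerate(basis):
            coeffs[d] += c * yk / denom
    return coeffs


if __name__ == "__main__":
    samples = [(m, n_exponent(m)) for m in range(1, 5)]
    print("samples:", samples)
    print("coefficients (constant term first):", interpolate(samples))
```

The leading coefficient 1/6 agrees with the exponent $\frac{1}{6}m^3 + o(m^3)$ obtained from Equation (1).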

This technique is much less error-prone than computing the determinant functions by hand and it allows for analyzing very complicated lattice basis structures. Applying this interpolation process to our unravelled lattice basis, we obtain $\det(L) = X^{-p_1(m)} U^{-p_2(m)} N^{p_3(m)}$ with

$$p_1(m) = \frac{1}{12}m^3 + o(m^3), \quad p_2(m) = \frac{1}{6}m^3 + o(m^3), \quad p_3(m) = \frac{1}{6}m^3 + o(m^3).$$

Our condition $\det(L) > 1$ thus translates into $\frac{5}{12}\delta < \frac{1}{6}$, resp. $\delta < \frac{2}{5}$. Interestingly, this is exactly the bound that Blackburn et al. [2] conjectured to be the best possible bound one can obtain by looking at two iterations of the PRG.

In the next section, we will also generalize our result to an arbitrary fixed number of iterations of the PRG. This should intuitively help to further improve the bounds and this intuition turns out to be true. To the best of our knowledge, our attack is the first one that is capable of exploiting more than two equations in the context of PRGs.


5 Using an Arbitrary Fixed Number of PRG Iterations 


We illustrate the basic idea of generalizing to more iterations by using three 
iterations of the generator before analyzing the general case. 

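Let $s_i = k_i + x_i$ for $i = 1, 2, 3$, where the $k_i$ are the output bits and the $x_i$ are
unknown. For these values, we are able to use two iterations of the recurrence
relation, namely

$$s_2 = s_1^2 \bmod N, \qquad s_3 = s_2^2 \bmod N,$$

from which we derive two polynomials

$$f_1 : \; x_1^2 - x_2 + \underbrace{2k_1}_{a_1} x_1 + \underbrace{k_1^2 - k_2}_{b_1} = 0 \bmod N,$$
$$f_2 : \; x_2^2 - x_3 + \underbrace{2k_2}_{a_2} x_2 + \underbrace{k_2^2 - k_3}_{b_2} = 0 \bmod N.$$

We perform the linearization step $f_1 \to g_1$ and $f_2 \to g_2$ by using the substitutions
$u_1 := x_1^2 - x_2$ and $u_2 := x_2^2 - x_3$.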
Fig. 3. Generic lattice basis for 2 polynomials


In the basis construction step, we have to define a collection for the polynomials
$g_1(u_1, x_1)$ and $g_2(u_2, x_2)$ using suitable shifts and powers. We will start by
doing this in some generic but non-optimal way, which is depicted in Figure 3
for the case of fixed total degree $m = 2$ in $g_1, g_2$. For better readability,
this basis matrix leaves out the left-hand diagonal consisting of the inverses of the
upper bounds of the corresponding monomials.

The reader may verify that the bound obtained from this collection of polynomials
is $\delta < \frac{4}{11} \approx 0.364$, which is exactly the same bound as in our starting
example in Section 0. A bit surprisingly, our generic lattice basis construction does not
immediately improve on the bound that we derived from a single polynomial.

It turns out, however, that we improve when taking just a small subset of the
collection in Fig. 3. If we only use the shifts $g_1, x_1 g_1, g_1^2$ and additionally $g_2$, then
we obtain a superior bound of $\delta < \frac{5}{13} \approx 0.385$. The reason for the improvement
comes from the fact that the monomial $x_2$ of $g_2$ can be reused as it already
appeared in the shifts $x_1 g_1$ and $g_1^2$.

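For the asymptotic analysis, we define the following collection of polynomials

$$g_{i,j,k} := x_1^k g_1^i g_2^j \quad \text{for } \begin{cases} i = 0, \dots, m \\ j = 0, \dots, \lfloor \frac{m-i}{2} \rfloor \\ k = 0, \dots, m - i - 2j \end{cases} \text{ with } i + j \ge 1.$$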
The intuition behind the definition of this collection of polynomials follows the
same reasoning as in the example for $m = 2$. We wish to keep small the number
of new monomials introduced by the shifts with $g_2$. Notice that the monomials $x_2^i$
for $i = 0, \dots, \lfloor \frac{m}{2} \rfloor$ already appeared in the $g_1$ shifts, since we back-substituted
$x_1^2 \to u_1 + x_2$. Therefore, it is advantageous to use the $g_2$ shifts only up to $\lfloor \frac{m}{2} \rfloor$.

With the interpolation technique introduced in Section 4, we derive a bound
of $\delta < \frac{6}{13}$ for the case of 2 polynomials, i.e. three output values of the generator.


5.1 Arbitrary Number of PRG Iterations 

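Given $n + 1$ iterations of the PRG, we select a collection of shift polynomials
following the intuition given in the previous section:

$$g_{i_1,\dots,i_n,k} := x_1^k g_1^{i_1} \cdots g_n^{i_n} \quad \text{for } \begin{cases} i_1 = 0, \dots, m \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{2} \rfloor \\ \quad \vdots \\ i_n = 0, \dots, \lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \rfloor \\ k = 0, \dots, m - \sum_{j=1}^{n} 2^{j-1} i_j \end{cases} \text{ with } i_1 + \dots + i_n \ge 1.$$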
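To perform the asymptotic analysis we need to determine the value of the determinant
of the corresponding lattice basis. This means we have to count the
exponents of all occurring monomials in the set of shift polynomials. We would
like to point out that because of the range of the index $k$, the shifts with $x_1^k$
do not introduce additional monomials over the set defined by the product of
the $g_i$ alone. For this product the monomials can be enumerated as follows (see
Appendix A for a proof):

$$x_1^{a_1} \cdots x_n^{a_n} \, u_1^{i_1 - a_1} \cdots u_{n-1}^{i_{n-1} - a_{n-1}} \, u_n^{i_n - 2b_n - a_n} \, x_{n+1}^{b_n} \quad \text{for } \begin{cases} i_1 = 0, \dots, m, & a_1 = 0, 1 \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{2} \rfloor, & a_2 = 0, 1 \\ \quad \vdots \\ i_n = 0, \dots, \lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \rfloor, & a_n = 0, 1 \\ b_n = 0, \dots, \lfloor \frac{i_n - a_n}{2} \rfloor \end{cases}$$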

We are only interested in the asymptotic behavior, i.e. we just consider the
highest power of $m$. We omit the floor function as it only influences a lower
order term. Analogously, we simplify the exponents of $u_j$ by omitting the value
$a_j$, since it is a constant polynomial in $m$. Furthermore, for the same reason the
contribution to the determinant of all $x_i$ with $i \le n$ can be neglected.
To derive the final condition, we have to compute the polynomials $p_j(m)$ of
the following expression for the determinant (resp. the coefficients of the highest
power of $m$):

$$\det(L) = X_{n+1}^{-p_x(m)} \, U_1^{-p_1(m)} \cdots U_n^{-p_n(m)} \, N^{p_N(m)}.$$
It seems to be a complicated task to compute these polynomials explicitly. Therefore,
we follow a different approach and compute the sizes of their leading coefficients
in relation to each other. This turns out to be enough to derive a bound on
the sizes of the unknowns. In Appendix B we explain how to derive the following
expressions for the polynomials:

$$p_j(m) = \frac{1}{2^{j-1}} p_1(m) \text{ for } j \le n, \qquad p_x(m) = \frac{1}{2^n} p_1(m), \qquad p_N(m) = \frac{2^n - 1}{2^{n-1}} p_1(m),$$

where we again omit low order terms. We use these expressions in the enabling
condition $\det(L) > 1$ and plug in upper bounds $X_{n+1} < N^{\delta}$ and $U_i < N^{2\delta}$. It is
sufficient to consider the condition for the exponents:

$$2\delta \sum_{j=1}^{n} \frac{1}{2^{j-1}} p_1(m) + \delta \frac{1}{2^n} p_1(m) < \frac{2^n - 1}{2^{n-1}} p_1(m).$$

Simplifying this condition and solving for $\delta$, we obtain

$$\delta < \frac{2^{n+1} - 2}{2^{n+2} - 3},$$

which converges for $n \to \infty$ to $\delta < \frac{1}{2}$.
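
As a sanity check on the closed form, the following sketch recomputes the bound
from the exponent relations, assuming they read $p_j = p_1 / 2^{j-1}$ for $j \le n$,
$p_x = p_1 / 2^n$ and $p_N = (2^n - 1) p_1 / 2^{n-1}$, with bounds $N^{\delta}$ on $x_{n+1}$ and
$N^{2\delta}$ on each $u_j$. The function names are illustrative and not part of the
original implementation.

```python
from fractions import Fraction


def closed_form_bound(n):
    """Bound on delta for n polynomials: (2^(n+1) - 2) / (2^(n+2) - 3)."""
    return Fraction(2 ** (n + 1) - 2, 2 ** (n + 2) - 3)


def bound_from_exponents(n):
    """Solve 2*delta*sum_j p_j + delta*p_x < p_N, all exponents relative to p_1."""
    u_exponents = [Fraction(1, 2 ** (j - 1)) for j in range(1, n + 1)]
    x_exponent = Fraction(1, 2 ** n)
    n_exponent = Fraction(2 ** n - 1, 2 ** (n - 1))
    # Each u_j is bounded by N^(2*delta), x_{n+1} by N^delta.
    lhs_per_delta = 2 * sum(u_exponents) + x_exponent
    return n_exponent / lhs_per_delta


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        bound = closed_form_bound(n)
        print(n, bound, float(bound))
```

For one and two polynomials this gives the two-iteration and three-iteration
bounds, and the values increase towards $\frac{1}{2}$ without reaching it.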


6 Extending to Higher Powers 

In the previous sections, we have considered PRGs with exponent $e = 2$ only,
i.e. a squaring operation in the recurrence relation. A generalization to arbitrary
exponents is straightforward.

Suppose the PRG has the recurrence relation $s_2 = s_1^e \bmod N$. Let, as in
Section 0, the output of the generator be $k_1, k_2$, i.e. we have $s_1 = k_1 + x_1$ and
$s_2 = k_2 + x_2$, for some unknown $x_1, x_2$.

Using the recurrence relation, this yields the polynomial equation

$$x_1^e - x_2 + e k_1 x_1^{e-1} + \dots + e k_1^{e-1} x_1 + k_1^e - k_2 = 0 \bmod N.$$

The linearization step is analogous to the case where $e = 2$; however, the unravelling
of the linearization only applies for higher powers of $x_1$, in this case $x_1^e$.

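The collection of shift polynomials using $n$ PRG iterations is

$$g_{i_1,\dots,i_n,k} := x_1^k g_1^{i_1} \cdots g_n^{i_n} \quad \text{for } \begin{cases} i_1 = 0, \dots, m \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{e} \rfloor \\ \quad \vdots \\ i_n = 0, \dots, \lfloor \frac{m - \sum_{j=1}^{n-1} e^{j-1} i_j}{e^{n-1}} \rfloor \\ k = 0, \dots, m - \sum_{j=1}^{n} e^{j-1} i_j \end{cases}$$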

Taking a closer look at the analysis in Appendices A and B shows that the generalization
for arbitrary $e$ is straightforward. Working through the analysis, we obtain
for arbitrary $e$ an asymptotic bound for an arbitrary number of polynomials of
$\delta < \frac{1}{e}$.


7 Experiments 

Since our technique uses a heuristic concerning the algebraic independence of the
obtained polynomials, we have to experimentally verify our results. Therefore,
we implemented the unravelled linearization using SAGE 3.4.1, including the $L^2$
reduction algorithm from Nguyen and Stehlé [12]. In Table 1, some experimental
results are given for a PRG with $e = 2$ and a 256-bit modulus $N$.


Table 1. Experimental Results for e = 2

polys   m   δ       exp. δ   dim(L)   time(s)
1       4   0.377   0.364    15       1
1       6   0.383   0.377    28       5
1       8   0.387   0.379    45       45
2       4   0.405   0.390    22       10
2       6   0.418   0.408    50       1250
3       4   0.407   0.400    23       5


In the first column we denote the number of polynomials. The second column
shows the chosen parameter $m$, which has a direct influence on how close we
approach the asymptotic bound. On the other hand, the parameter $m$ increases
the lattice dimension and therefore the time required to compute a solution. The
theoretically expected $\delta$ is given in the third column, whereas the actually verified
$\delta$ is given in the fourth column. The fifth column gives the lattice dimension,
and the last column denotes the time required to
find the solution on a Core2 Duo 2.2 GHz running Linux 2.6.24.

It is worth mentioning that most of the time to find the solution is not spent
on the lattice reduction, but on extracting the common root from the
set of polynomials using resultant computations. The resultant computations
yielded the desired solutions of the power generators.

Acknowledgement. We would like to thank Dan Bernstein for bringing this 
research topic to our attention during an Ecrypt meeting. 
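A Describing the Set of Monomials

Theorem 1. Suppose we have $n$ polynomials of the form

$$f_i(x_i, x_{i+1}) = x_i^2 + a_i x_i + b_i - x_{i+1}$$

and define the collection of polynomials

$$f_1^{i_1} \cdots f_n^{i_n} \quad \text{for } \begin{cases} i_1 = 0, \dots, m \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{2} \rfloor \\ \quad \vdots \\ i_n = 0, \dots, \lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \rfloor \end{cases}$$

After performing the substitutions $x_i^2 \mapsto u_i + x_{i+1}$, the set of all occurring
monomials can be described as

$$x_1^{a_1} \cdots x_n^{a_n} \, u_1^{i_1 - a_1} \cdots u_{n-1}^{i_{n-1} - a_{n-1}} \, u_n^{i_n - 2b_n - a_n} \, x_{n+1}^{b_n} \quad \text{for } \begin{cases} i_1 = 0, \dots, m, & a_1 = 0, 1 \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{2} \rfloor, & a_2 = 0, 1 \\ \quad \vdots \\ i_n = 0, \dots, \lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \rfloor, & a_n = 0, 1 \\ b_n = 0, \dots, \lfloor \frac{i_n - a_n}{2} \rfloor \end{cases}$$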

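Proof. By induction. Basic step: $n = 1$.

For one polynomial $f_1(x_1, x_2) = x_1^2 + a_1 x_1 + b_1 - x_2$ we perform the substitution
$x_1^2 \mapsto u_1 + x_2$ to obtain $g_1(u_1, x_1) = u_1 + a_1 x_1 + b_1$. The set of all monomials
that are introduced by the powers of $g_1(u_1, x_1)$ can be described as

$$x_1^{j_1} u_1^{i_1 - j_1} \quad \text{for } i_1 = 0, \dots, m, \; j_1 = 0, \dots, i_1.$$

It remains to perform the substitution on this set. Therefore, we express the
counter $j_1$ by two counters $a_1$ and $b_1$ and let $j_1 = 2b_1 + a_1$, i.e. we write the set
as

$$(x_1^2)^{b_1} x_1^{a_1} u_1^{i_1 - 2b_1 - a_1} \quad \text{for } \begin{cases} i_1 = 0, \dots, m \\ a_1 = 0, 1 \\ b_1 = 0, \dots, \lfloor \frac{i_1 - a_1}{2} \rfloor \end{cases}$$

Imagine that we enumerate the monomials for fixed $i_1, a_1$ and increasing $b_1$,
and simultaneously perform the substitution $x_1^2 \mapsto u_1 + x_2$. The key point to
notice is that all monomials that occur after the substitution, i.e. all of
$(u_1 + x_2)^{b_1} x_1^{a_1} u_1^{i_1 - 2b_1 - a_1}$, have been enumerated by a previous value of $b_1$, except for
the single monomial $x_2^{b_1} x_1^{a_1} u_1^{i_1 - 2b_1 - a_1}$.

Thus, the set of monomials after the substitution can be expressed as

$$x_1^{a_1} u_1^{i_1 - 2b_1 - a_1} x_2^{b_1} \quad \text{for } \begin{cases} i_1 = 0, \dots, m \\ a_1 = 0, 1 \\ b_1 = 0, \dots, \lfloor \frac{i_1 - a_1}{2} \rfloor \end{cases}$$

This concludes the basic step.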

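Inductive step: $n - 1 \to n$.

Suppose the assumption is correct for $n - 1$ polynomials. By the construction of
the shift polynomials and the induction hypothesis, we have the set of monomials

$$\underbrace{x_1^{a_1} \cdots x_{n-1}^{a_{n-1}} \, u_1^{i_1 - a_1} \cdots u_{n-2}^{i_{n-2} - a_{n-2}} \, u_{n-1}^{i_{n-1} - 2b_{n-1} - a_{n-1}} \, x_n^{b_{n-1}}}_{\text{Hypothesis}} \; x_n^{j_n} u_n^{i_n - j_n}$$

$$\text{for } \begin{cases} i_1 = 0, \dots, m, & a_1 = 0, 1 \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{2} \rfloor, & a_2 = 0, 1 \\ \quad \vdots \\ i_n = 0, \dots, \lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \rfloor \\ b_{n-1} = 0, \dots, \lfloor \frac{i_{n-1} - a_{n-1}}{2} \rfloor \\ j_n = 0, \dots, i_n \end{cases}$$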



By adding the $n$-th polynomial, we also get the new relation $x_n^2 = u_n + x_{n+1}$.
Before performing the substitutions, however, we have to take a closer look at
the powers of $x_n$. The problem seems to be that we have a contribution from
the $n$-th polynomial as well as from some previous substitutions. It turns out
that this can be handled quite elegantly. Namely, we will show that all occurring
monomials are enumerated by just taking $b_{n-1} = 0$.

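Consider the set of monomials for $b_{n-1} = c$ for some constant $c \ge 1$:

$$x_1^{a_1} \cdots u_{n-1}^{i_{n-1} - 2c - a_{n-1}} \, x_n^{j_n + c} \quad \text{for } j_n \in \{0, \dots, i_n\}.$$

Exactly the same set of monomials is obtained by considering the index
$i'_{n-1} = i_{n-1} - 2$ and $b_{n-1} = c - 1$. Notice that in this case the counter $i'_n$, which serves
as an upper bound of $j'_n$, runs from 0 through

$$\left\lfloor \frac{m - \sum_{j=1}^{n-2} 2^{j-1} i_j - 2^{n-2} i'_{n-1}}{2^{n-1}} \right\rfloor = \left\lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j + 2^{n-1}}{2^{n-1}} \right\rfloor = \left\lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \right\rfloor + 1.$$

Thus, we have the same set of monomials as with $b_{n-1} = c - 1$:

$$x_1^{a_1} \cdots u_{n-1}^{i'_{n-1} - 2(c-1) - a_{n-1}} \, x_n^{j'_n + c - 1} \quad \text{for } j'_n \in \{0, \dots, i'_n\}.$$

Iterating this argument, we conclude that all monomials are enumerated by
$b_{n-1} = 0$.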

Having combined the occurring powers of $x_n$, we continue by performing a
step analogous to the basic step: introduce $a_n$ and $b_n$ representing $j_n$. This leads
to

$$x_1^{a_1} \cdots u_{n-1}^{i_{n-1} - a_{n-1}} \, (x_n^2)^{b_n} x_n^{a_n} u_n^{i_n - 2b_n - a_n} \quad \text{for } \begin{cases} i_1 = 0, \dots, m, & a_1 = 0, 1 \\ i_2 = 0, \dots, \lfloor \frac{m - i_1}{2} \rfloor, & a_2 = 0, 1 \\ \quad \vdots \\ b_n = 0, \dots, \lfloor \frac{i_n - a_n}{2} \rfloor \end{cases}$$

Finally we substitute $x_n^2 = u_n + x_{n+1}$. Using the same argument as in the basic
step, we note that new monomials only appear for powers of $x_{n+1}$.

B Relations among Exponent Polynomials 

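For the determinant computation we need to sum up the exponents of the occurring
monomials. Take for example $u_\ell$ with $\ell < n$: using the description of the
set from Appendix A, we need to compute

$$\sum_{i_1=0}^{m} \sum_{a_1=0}^{1} \sum_{i_2=0}^{\lfloor \frac{m - i_1}{2} \rfloor} \sum_{a_2=0}^{1} \cdots \sum_{i_n=0}^{\lfloor \frac{m - \sum_{j=1}^{n-1} 2^{j-1} i_j}{2^{n-1}} \rfloor} \sum_{a_n=0}^{1} \sum_{b_n=0}^{\lfloor \frac{i_n - a_n}{2} \rfloor} (i_\ell - a_\ell).$$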

We will step by step simplify this expression using the fact that in the asymptotic 
consideration only the highest power of the parameter m is important. 

In the first step we notice that we may remove the $-a_\ell$ from the summation,
because $a_\ell$ does not depend on $m$, while $i_\ell$ does. Therefore, the $a_\ell$ just affects
lower order terms. With the same argument we can omit the $a_n$ in the upper
bound of the sum over $b_n$. Further, the floor function in the limit of the sums
only affects lower order terms and therefore may be omitted. Next, we can
move all the sums of the $a_j$ to the front, since they are no longer referenced
anywhere, and replace each of these sums by a factor of 2, making altogether a
global factor of $2^n$.

For further simplification of the expression, we wish to eliminate the fractions
that appear in the bounds of the sums. To give an idea how to achieve this,
consider the expression

$$\sum_{i_1=0}^{m} \sum_{i_2=0}^{\frac{m - i_1}{2}} i_2.$$

Our intuition is to imagine an index $i_2'$ of the second sum that performs steps
with a width of 2 and is upper bounded by $m - i_1$. To keep it equivalent, we
have to compute the sum over all integers of the form $\lfloor \frac{i_2'}{2} \rfloor$. However, when
changing the index to $i_2'$, the sum surely does not perform steps with width 2,
i.e. we count every value exactly twice. Thus, to obtain a correct reformulation,
we have to divide the result by 2. Note that asymptotically we may omit the
floor function and simply sum over $\frac{i_2'}{2}$.

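In the same way we are able to reformulate all sums from $i_1$ to $i_n$. For better
readability we replaced $i_\ell'$ with $i_\ell$ again.

$$2^n \cdot \prod_{j=1}^{n} \frac{1}{2^{j-1}} \cdot \sum_{i_1=0}^{m} \sum_{i_2=0}^{m - i_1} \cdots \sum_{i_n=0}^{m - \sum_{j=1}^{n-1} i_j} \sum_{b_n=0}^{i_n / 2^n} \frac{i_\ell}{2^{\ell-1}}$$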

It seems to be a complicated task to explicitly evaluate a sum of this form. Therefore,
we follow a different approach, namely we relate the sums over different $i_\ell$
to each other. We start with the discussion of a slightly simpler observation:

Sums of the form $\sum_{i_1=0}^{m} \sum_{i_2=0}^{m - i_1} \cdots \sum_{i_n=0}^{m - \sum_{j=1}^{n-1} i_j} i_\ell$ are equal for all $\ell \le n$.

An explanation can be given as follows. Imagine the geometric object that is
represented by taking the $i_j$ as coordinates in an $n$-dimensional space. This set
describes an $n$-dimensional simplex, e.g. a triangle for $n = 2$, a tetrahedron for
$n = 3$, etc. Considering its regular structure, i.e. the symmetry in the different
coordinates, it should be clear that the summation over each of the $i_\ell$ results in
the same value.
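
The symmetry claim is easy to confirm numerically for small parameters. The sketch
below enumerates all integer points of the simplex $i_1 + \dots + i_n \le m$ (which is
exactly the index set of the nested sums above) and adds up each coordinate
separately; the function name is an illustrative choice.

```python
from itertools import product


def simplex_coordinate_sums(n, m):
    """Sum each coordinate i_l over all integer points with i_1 + ... + i_n <= m."""
    totals = [0] * n
    for point in product(range(m + 1), repeat=n):
        if sum(point) <= m:
            for index, value in enumerate(point):
                totals[index] += value
    return totals


if __name__ == "__main__":
    for n, m in [(2, 2), (3, 6), (4, 5)]:
        print(n, m, simplex_coordinate_sums(n, m))
```

Every run prints a list whose entries coincide, which is the statement that the
sum over $i_\ell$ does not depend on $\ell$.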

In the reformulated sum above there is an additional inner summation with index
$b_n$ and limit $i_n / 2^n$. For the indices $\ell < n$ this innermost sum is constant for all
values of $\ell$ and thus with the previous argumentation the whole sums are equal
for all $\ell < n$. We only have to take care of the leading factors, i.e. the powers of
2 that came from replacing the summation variables.

This gives us already a large amount of the exponent polynomials in the
determinant expression. Namely, we are able to formulate the polynomials $p_\ell$
(which is the sum over the $i_\ell$) in terms of $p_1$ for all $\ell < n$. The difference is
exactly the factor $\frac{1}{2^{\ell-1}}$ that has been introduced when changing the index from
$i_\ell$ to $i_\ell'$.

For the exponent polynomial of the variable $u_n$, however, we have to be careful
because we do not compute the summation of $i_n - a_n$, but of $i_n / 2^{n-1} - 2b_n - a_n$
instead ($i_n / 2^{n-1}$ since we changed the summation index $i_n$). The value $-a_n$ can
be omitted with the same argument as before. To derive a relation of $p_n$ to $p_1$,
we start by evaluating the inner sums:


Pn 


Pi : • • • E fc 

i n = o b n = 0 

r -»0 




Notice that once again, for the asymptotic analysis we have only considered the
highest powers.

Because of the previously mentioned symmetry between $i_1$ and $i_n$, we finally
derive $p_n = \frac{1}{2^{n-1}} p_1$. The same argument can be used to derive the bound on the
variable $x_{n+1}$ for which we have to compute the sum


& ™-E”Tib , 2 

p x :... £ X> = -'- E it' 

i „= 0 6„=0 i „= 0 

The multiplicative relation between $p_1$ and $p_x$ is therefore $p_x = \frac{1}{2^n} p_1$.

Finally, to compute the exponent of $N$ in the determinant, we have to sum
up all exponents that occur in the enumeration of the shift polynomials given in
Section 5.1. The simplifications are equivalent to the ones used before and we
obtain:


We first note that for l <n we may write 

' ■ • E E 1 with c = m-J2 i r 

i „= 0 k = 0 j = 0 

This is asymptotically equivalent to 

• • • ^i if - E E 1 = 2 " • • • • E E 1 = 


For £ = n we argue again that the summations for different if behave the same 
way. Thus it follows | • \ ■ . . . ■ ^ 

Summing up, we obtain

$$p_N = \left( 1 + \frac{1}{2} + \dots + \frac{1}{2^{n-1}} \right) p_1 = \frac{2^n - 1}{2^{n-1}} p_1.$$
Abstract. By a computational puzzle we mean a mildly difficult computational
problem that requires resources (processor cycles, memory,
or both) to solve. Puzzles have found a variety of uses in security. In 
this paper we are concerned with client puzzles : a type of puzzle used 
as a defense against Denial of Service (DoS) attacks. The main contri- 
bution of this paper is a formal model for the security of client puzzles. 
We clarify the interface that client puzzles should offer and give two se- 
curity notions for puzzles. Both functionality and security are inspired 
by, and tailored to, the use of puzzles as a defense against DoS attacks. 
Our definitions fill an important gap: breaking either of the two proper- 
ties immediately leads to successful DoS attacks. We illustrate this point 
with an attack against a previously proposed puzzle construction. We 
also provide a generic construction of a client puzzle which meets our 
security definitions. 


1 Introduction 

A Denial of Service (DoS) attack on a server aims to render it unable to provide 
some service by depleting its internal resources. For example, the famous TCP- 
SYN flooding attack 0 prevents further connections to a server by starting a 
large number of TCP sessions which are then left uncompleted. The effort of the 
attacker is rather small, whereas the server quickly runs out of resources (which 
are allocated to the unfinished sessions). 

One countermeasure against connection depletion DoS attacks uses client puz- 
zles H3 When contacted by some unauthenticated, potentially malicious, client 
to execute some protocol and before allocating any resources for the execution, 
the server issues a client puzzle - a moderately hard computational problem. 
The server only engages in the execution of the protocol (and thus allocates 
resources) when the client returns a valid solution to the puzzle. The idea is 
that the server spends its resources only after the client has spent a significant 
amount of resources itself. To avoid the burden of running the above mechanism 
when no attackers are present, the defense only kicks in whenever the server 
resources drop below a certain threshold. 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 505–523, 2009.
Client puzzles have received a lot of attention in the cryptographic community
|2I5I1 011 1 II 412 412 712 but most of the prior work consists of proposing puzzle
constructions and arguing that those constructions do indeed work. Although
sometimes technical, such security arguments are with respect to intuitive se- 
curity notions for puzzles since rigorous formal models for the security of such 
puzzles are missing. The absence of such models has (at least) two undesirable 
consequences. On the one hand the investigation of puzzle constructions usually 
concentrates on some security aspects and omits others which are of equal im- 
portance when puzzles are used as part of other protocols. More importantly, 
the absence of formal models prevents a rigorous, reduction-based analysis of the 
effectiveness of puzzles against DoS in the style of modern cryptography (where 
the existence of a successful DoS attacker implies the existence of an attacker 
against client puzzles). 

In this paper we aim to solve the first problem outlined above as a first key step 
towards solving the second one. The main contribution of this paper is a formal 
framework for the design and analysis of client puzzles. In addition to fixing their 
formal syntax, we design security notions inspired by, and therefore tailored for, 
the use of client puzzles as a defense against DoS attacks. Specifically, we require 
that an adversary cannot produce valid puzzles on his own ( puzzle-unforgeability ) 
and that puzzles are non-trivial - the client needs indeed to spend at least a 
specified amount of resources to solve them - ( puzzle difficulty). The use of 
client puzzles that do not fulfill at least one of our notions immediately leads 
to a successful DoS attack. Our definitions use well-established intuition and 
techniques for defining one-wayness and authentication properties. Apart from 
some design decisions regarding the measure for resources and the precise oracles 
an adversary should have access to, there are no deep surprises here. However, 
we highlight that the lack of rigorous definitions such as those we put forward in 
this paper is dangerous. Constructions that are secure at an intuitive level may
in fact be insecure when used. Indeed, we explicitly demonstrate that a popular
construction, which does not meet our notion of unforgeability, does not protect
against and in fact facilitates DoS attacks in systems that use it.

Furthermore, we give a generic construction of a client puzzle that is secure in 
the sense we define. Many existing client puzzle constructions can be obtained 
as an instantiation of our generic construction, with only minor modifications if 
any. Our construction uses a pseudorandom function family to provide puzzle- 
unforgeability and puzzle-difficulty is obtained from a one-way function given a 
large part of the preimage. We prove our construction secure via an asymptotic 
reduction for unforgeability and a concrete reduction for difficulty. Next, we
discuss our results in more detail.


Our Contribution 

Formal Syntax of a Client Puzzle. Our first contribution is a formal syn- 
tax for client puzzles. We define a client puzzle as a tuple of algorithms for sys- 
tem setup, puzzle generation, solution finding, puzzle authenticity checking, and 
solution checking. The definition is designed to capture the main functionality 
required from client puzzles from the perspective of their use against DoS
attacks. 

Security Notions for Client Puzzles. The use of puzzles against DoS 
attacks also inspired the two (orthogonal) security notions for client puzzles 
that we design. 

To avoid storing puzzles handed out to clients (a resource consuming task), 
the server gives puzzles away and expects the client to hand back both the puzzle 
and its solution. Obviously, the server needs to be sure the client cannot produce 
puzzles on its own, as this would lead to trivial attacks. We remark that this 
aspect is often overlooked in existing constructions since it is only apparent when 
puzzles are considered in the precise context for which they are intended. We 
capture this requirement via the notion of puzzle-unforgeability. Formally, we 
define a security game where the adversary is given certain querying capabilities 
(he can, for example, request to see puzzles and their solutions, verify the
authenticity of puzzles, etc.) and aims to output a new puzzle which the server
deems as valid. 

The second notion, puzzle-difficulty, reflects the idea that the client needs to 
spend a certain amount of resources to solve a puzzle. In our definition we took 
adversary resources to mean “clock cycles” , as this design decision allows us to 
abstract away undesirable details like the distributed nature of many DoS adver- 
saries. We define a security game where the adversary is given various querying 
capabilities sufficient for mimicking a DoS attack-like environment: he can see puz- 
zles and their solutions, obtain solutions for puzzles he chooses, etc. The challenge 
for the adversary is to solve a given challenge puzzle spending less than a certain 
number of clock cycles, with probability better than a certain threshold. 

An Attack on the Juels and Brainard Puzzles. Most of the previous 
work on puzzles concentrates exclusively on the difficulty aspect and overlooks, 
or only partially considers, the unforgeability property. One such work is the 
puzzle construction proposed by Juels and Brainard H3 We demonstrate the 
usefulness of our definitions by showing the Juels and Brainard construction 
is forgeable. We then explain how a system using this kind of puzzle can be 
attacked by exploiting the weakness we have identified. 

Generic Constructions. We provide a generic construction of a client puz- 
zle inspired by the Juels and Brainard sub-puzzle construction 0 First, we 
evaluate a pseudorandom function (PRF), keyed by some secret value, on inputs 
including a random nonce, hardness parameter and a system specific string. This 
stage ensures uniqueness of the puzzle and the desired unforgeability; only the 
server that possesses the hidden key is able to perform this operation and hence 
generate valid puzzles. The remaining information to complete the puzzle is then 
computed by evaluating a one way function (OWF), for which finding preimages 
has a given difficulty, on the output of the PRF; the goal in solving the puzzle is 
to find such a preimage given the inputs to the PRF and the target. The idea is 
that the client would need to do an exhaustive search on the possible preimage 
space to find such a preimage. We certify the intuition by rigorous proofs that
the generic construction meets the security definitions that we put forth, for ap- 
propriately chosen parameters. Importantly, many secure variants of previously 
proposed constructions can be obtained as instances of our generic construction. 
For example, the puzzle constructions proposed by Juels and Brainard m puz- 
zle and the two-party variant of the Waters et al. puzzles m can be seen as 
variants of our generic construction. Finally, we provide concrete security bounds 
for the first of these puzzles. We do so in the random oracle model which we use 
to obtain secure and efficient instantiations of the two primitives used by our 
generic construction. 
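
To make the shape of the generic construction concrete, here is a minimal sketch.
It is not the paper's instantiation: HMAC-SHA256 stands in for the PRF, SHA-256 for
the one-way function, the hint reveals all but `hardness_bits` bits of the preimage,
and all function and field names are hypothetical choices for illustration.

```python
import hashlib
import hmac
import os


def _prf(key, nonce, hardness_bits, system_string):
    # Assumed PRF: HMAC-SHA256 over nonce, hardness parameter and system string.
    message = nonce + bytes([hardness_bits]) + system_string
    return hmac.new(key, message, hashlib.sha256).digest()


def gen_puzzle(key, hardness_bits, system_string):
    """Server side: derive the hidden preimage with the PRF, publish hint and target."""
    nonce = os.urandom(16)
    preimage = _prf(key, nonce, hardness_bits, system_string)
    target = hashlib.sha256(preimage).digest()
    # Reveal a large part of the preimage; only hardness_bits bits stay hidden.
    hint = int.from_bytes(preimage, "big") >> hardness_bits
    return {"nonce": nonce, "Q": hardness_bits, "str": system_string,
            "hint": hint, "target": target}


def find_solution(puzzle):
    """Client side: exhaustive search over the hidden low-order bits."""
    for low_bits in range(1 << puzzle["Q"]):
        candidate = ((puzzle["hint"] << puzzle["Q"]) | low_bits).to_bytes(32, "big")
        if hashlib.sha256(candidate).digest() == puzzle["target"]:
            return candidate
    return None


def verify_auth(key, puzzle):
    """Server side: a puzzle is authentic if it matches the recomputed PRF output."""
    preimage = _prf(key, puzzle["nonce"], puzzle["Q"], puzzle["str"])
    hint_ok = (int.from_bytes(preimage, "big") >> puzzle["Q"]) == puzzle["hint"]
    target_ok = hmac.compare_digest(hashlib.sha256(preimage).digest(), puzzle["target"])
    return hint_ok and target_ok


def verify_solution(key, puzzle, solution):
    """Server side: accept only solutions to authentic puzzles."""
    if not verify_auth(key, puzzle):
        return False
    return hmac.compare_digest(hashlib.sha256(solution).digest(), puzzle["target"])


if __name__ == "__main__":
    server_key = os.urandom(32)
    puzzle = gen_puzzle(server_key, 12, b"demo-service")
    solution = find_solution(puzzle)
    print("solved:", solution is not None)
    print("accepted:", verify_solution(server_key, puzzle, solution))
```

Because the server can recompute the PRF from the values the client hands back, it
does not need to store issued puzzles; a client without the key cannot produce a
puzzle that passes `verify_auth`, which is the role unforgeability plays here.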

Related Work 

Merkle Puzzles. The use of puzzles in cryptography was pioneered by Merkle 
HE! who used puzzles to establish a secret key between parties over an inse- 
cure channel. Since then the optimality of Merkle puzzles has been analyzed by 
Impagliazzo and Rudich m and Barak and Mahmoody-Ghidary jEj. The pos- 
sibility of basing weak public key cryptography on one-way functions, or some 
variant of them was recently explored by Biham, Goren and Ishai HI- Specifi- 
cally, a variant of Merkle’s protocol is suggested whose security is based on the 
one-wayness of the underlying primitive. 

Client Puzzles. Client puzzles were first introduced as a defense mechanism 
against DoS attacks by Juels and Brainard d The construction they proposed 
uses hash function inversion as the source of puzzle-difficulty. They also attempt 
to obtain puzzle-unforgeability but partially fail in two respects. By neglecting 
the details of how puzzles are to be used against a DoS attack, the construction 
suffers from a flaw (which we explain how to exploit later in this paper) that 
can be used to mount a DoS attack. Secondly, despite intuitive claims that secu- 
rity is based on the one-wayness of the hash function used in the construction, 
security requires much stronger assumptions, namely one-wayness with partial 
information about the preimage. The authors also present a method to combine 
a key agreement or authentication protocol with a client puzzle, and present a 
set of informal desirable properties of puzzles. Building on this work, Aura et 
al. j2| use the same client puzzle protocol construction but present a new client 
puzzle mechanism, also based on hash function inversion, and extend the set of 
desirable properties. 

An alternative method for constructing client puzzles and client puzzle proto- 
cols was proposed by Waters et al. ES|. This technique assumes the client puzzle 
protocol is a three party protocol and constructs a client puzzle based on the dis- 
crete logarithm problem for which authenticity and correctness can be verified us- 
ing a Diffie-Hellman based technique. One of the main advantages of this approach 
is that puzzle generation can be outsourced from the server to another external 
bastion, yet verification of solutions can be performed by the server itself. 

More recently Tritilanunt et al. m proposed a client puzzle based on the 
subset sum problem. Schaller et al. m have also used what they refer to as 
cryptographic puzzles for broadcast authentication in networks. 
An interesting line of work analyzes ways to construct stronger puzzles out of
weaker ones. The concept of chaining together client puzzles to produce a new 
and more difficult client puzzle was introduced by Groza and Petrica na- Their 
construction enforces a sequential solving strategy, and thus yields a harder puz- 
zle. A related work is that of Canetti, Halevi, and Steiner jjJJ who are concerned 
with relating the difficulty of solving one single puzzle to that of solving several 
independent puzzles. They consider the case of “weakly” verifiable puzzles (puz- 
zles for which the solution can only be checked by the party that produced the 
puzzles). That paper does not consider the use of puzzles in the context of DoS 
attacks, and thus is not concerned with authenticity. 

Client Puzzle Protocols. In an interesting paper that analyzes resistance of 
client puzzle protocols to man-in-the-middle attacks EH, Price concludes that in 
any secure protocol the server needs to resort to digital signatures. We note that 
such concerns are related but orthogonal to the goals that we pursue in this pa- 
per. Indeed, in prior literature there is no clear distinction between client puzzles 
(the problems that the server hands out for clients to solve) and client puzzle
protocols (the ensemble that includes, in addition to the particular puzzles that
are constructed, the way state is maintained by the server, the mechanism for
deeming a puzzle as expired, etc.). We emphasize that in this paper we are mainly
concerned with the former so the results of EH do not apply.

DoS Attacks. A classification of remote DoS attacks, countermeasures and a 
brief consideration of Distributed Denial of Service (DDoS) attacks were given 
by Karig and Lee EH- Following this Specht and Lee EH! give a classification 
of DDoS attacks, tools and countermeasures. In EH the adversarial model of 
EH is extended to include Internet Relay Chat (IRC) based models. The au- 
thors of (2E1 also classify the types of software used for such attacks and the 
most common known countermeasures. Other classifications of DDoS attacks 
and countermeasures were later given by urn- 

A number of protocols have been designed to resist DDoS attacks. The most 
important examples are the JFK protocol P and the HIP protocol [213] ■ The 
JFK protocol of Aiello et al. P trades the forward secrecy property, known as 
adaptive forward secrecy, for denial of service resistance. The original protocol 
does not use client puzzles. In j2H| the cost based technique of Meadows jldi!7j 
is used to analyze the JFK protocol. Two denial of service attacks are found and 
both can be prevented by introducing a client puzzle into the JFK protocol. 

Spam and Time-Lock Crypto. Other proposals for the use of puzzles include 
the work of Dwork and Naor who propose to use a pricing function (a particular 
type of puzzles) to combat junk email [B|. The basic principle is the pricing func- 
tion costs a given amount of computation to compute and this computation can 
be verified cheaply without any additional information. A service provider could 
then issue a “stamp duty” on bulk mailings. Finally, Rivest et al. introduced the 
notion of timed-release crypto in EH and instantiate this notion with a time-lock
puzzle. The overall goal of timed-release crypto is to encrypt a message such that
nobody, even the sender, can decrypt it before a given length of time has passed. 

Paper Overview. We start with a sample client puzzle in Section 2. Our
formal definition of a client puzzle and a client puzzle protocol is in Section 3.
In Section 4 we give security notions for client puzzles in terms of unforgeability
and difficulty. We demonstrate that the Juels and Brainard client puzzle is
insecure in Section 5. Finally, our generic construction of a client puzzle is given
in Section 6. We also include a sample instantiation based on hash functions
which we analyze in the random oracle model.


2 Juels and Brainard Puzzles 

To illustrate some of the basic ideas behind the construction of puzzles, we 
first give a brief description of the puzzle generation process for the Juels and 
Brainard construction. In our description we refer to the (authorized) puzzle
generation entity (or user) as the generator and the (authorized) puzzle solving 
entity (or user) as the solver. We use the term “puzzle” from here onwards 
for individual puzzle instances. We write $\{0,1\}^t$ for the set of binary strings of
length $t$ and $\{0,1\}^*$ for the set of binary strings of arbitrary finite length. If
$x = x_1, x_2, \ldots, x_i, \ldots, x_j, \ldots, x_n$ is a bit string then we let
$x\langle i, j\rangle$ denote the substring $x_i, \ldots, x_j$.

For this construction the generator (generally some server) holds a long term
secret value $s$ chosen uniformly from a space large enough to prevent exhaustive
key search attacks. The server also chooses a hardness parameter: a pair
$Q = (\alpha, \beta) \in \mathbb{N}^2$ which ensures that puzzles have a certain amount of
difficulty to solve. We let $H : \{0,1\}^* \to \{0,1\}^m$ be some hash function. To
generate a new puzzle the generator performs the following steps to compute the
required sub-puzzle instances $P_j$ for $j \in \{1, 2, \ldots, \beta\}$:

• A bit string $\sigma_j$ is computed as $\sigma_j = H(s, \mathsf{str}, j)$. The value $\mathsf{str}$ has the
structure $\mathsf{str} = t \| M$ for $t$ some server time value$^1$ and $M$ some unique value$^2$. We
denote $x_j = \sigma_j\langle 1, \alpha\rangle$ and $z_j = \sigma_j\langle \alpha + 1, m\rangle$.

• A value $y_j$ is computed as $y_j = H(\sigma_j)$ and the sub-puzzle instance is
$P_j = (z_j, y_j)$.

The full puzzle instance is then the required parameters plus the tuple of
sub-puzzle instances $\mathsf{puz} = (Q, \mathsf{str}, P = (P_1, P_2, \ldots, P_\beta))$. The sub-puzzle
instance generation process is summarized in Figure 1.

A solution to a given sub-puzzle $P_j$ is any string $x'_j$ such that $H(x'_j \| z_j) = y_j$.
The solution to the full puzzle instance is a tuple of solutions to the sub-puzzles.
To verify a potential solution $\mathsf{soln} = (Q, \mathsf{str}, \mathsf{soln}_1, \ldots, \mathsf{soln}_\beta)$ the generator
verifies each $P_j$ and $\mathsf{soln}_j$ by checking that $H(\mathsf{soln}_j \| z_j) = y_j$ for each $j$. The
authenticity of a given puzzle is checked by regenerating each $P_j$ using $s$ and
comparing this to the puzzle submitted.

1 The details of the type of value this is are not described by Juels and Brainard,
but here we will assume it is a bit string.

2 Juels and Brainard specify this as the first message flow of a protocol or some
other unique data. Again, we will assume this is encoded as a bit string since this
is not specified.

Fig. 1. The Juels and Brainard Sub-Puzzle Instance Generation (generation parameters: $s$, $\mathsf{str}$, $j$)

To incorporate this client puzzle into a client puzzle protocol the server, on 
receiving a valid solution, allocates buffer slots, by using a hash table on the 
values of M, for each puzzle and correct solution submitted. This ensures that 
only one puzzle instance and solution are accepted for a given value of M. 
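
The generation, solving and verification steps described in this section can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the reference implementation of the construction: SHA-256 stands in for the hash function H (so m = 256), bit strings are represented as Python strings of '0' and '1', the tuple (s, str, j) is encoded with a '|' separator, and all function names are ours.

```python
import hashlib
import itertools

M_BITS = 256  # assumption: H is SHA-256, so m = 256


def H(data: bytes) -> str:
    """Hash to a bit string ('0'/'1' characters) of length M_BITS."""
    return "".join(f"{byte:08b}" for byte in hashlib.sha256(data).digest())


def gen_puzzle(s: bytes, string: bytes, alpha: int, beta: int):
    """Return puz = (Q, str, P) with P_j = (z_j, y_j) for j = 1..beta."""
    subpuzzles = []
    for j in range(1, beta + 1):
        # sigma_j = H(s, str, j); the '|' encoding is an assumption.
        sigma = H(s + b"|" + string + b"|" + str(j).encode())
        z = sigma[alpha:]        # z_j = sigma_j<alpha+1, m>
        y = H(sigma.encode())    # y_j = H(sigma_j)
        subpuzzles.append((z, y))
    return ((alpha, beta), string, tuple(subpuzzles))


def find_soln(puz):
    """Brute-force the missing alpha-bit prefix x_j of every sub-puzzle."""
    (alpha, _beta), _string, subpuzzles = puz
    solution = []
    for z, y in subpuzzles:
        for candidate in itertools.product("01", repeat=alpha):
            x = "".join(candidate)
            if H((x + z).encode()) == y:
                solution.append(x)
                break
    return tuple(solution)


def ver_soln(puz, soln) -> bool:
    """Check H(soln_j || z_j) = y_j for every sub-puzzle."""
    _q, _string, subpuzzles = puz
    return len(soln) == len(subpuzzles) and all(
        H((x + z).encode()) == y for x, (z, y) in zip(soln, subpuzzles)
    )


def ver_auth(s: bytes, puz) -> bool:
    """Regenerate the puzzle from s and compare it to the one submitted."""
    (alpha, beta), string, _subpuzzles = puz
    return gen_puzzle(s, string, alpha, beta) == puz
```

With a small hardness such as alpha = 8 the brute-force search tries at most 256 prefixes per sub-puzzle, which keeps the sketch fast enough to experiment with.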

3 Client Puzzles 

The role of a client puzzle in a protocol is to give one party some assurance that 
the other party has spent at least a given amount of effort computing a solution 
to a given puzzle instance. In this section we give a formal definition of a client 
puzzle in the most general sense. 

Formal Syntax of A Client Puzzle. A client puzzle is a tuple of algorithms: 
a setup algorithm for generating long term public and private parameters, an 
algorithm for generating puzzle instances of a given difficulty, a solution finding 
algorithm, an algorithm for verifying authenticity of a puzzle instance and an 
algorithm for verifying correctness of puzzle and solution pairs. We formally 
define a client puzzle as follows. 

Definition 1 (Client Puzzle). A client puzzle CPuz = (Setup, GenPuz, 
FindSoln, VerAuth, VerSoln) is given by the following algorithms: 

• Setup is a p.p.t. setup algorithm. On input of $1^k$, for security parameter $k$,
it performs the following operations:

  • Selects the long term secret key space sSpace, hardness space QSpace,
  string space strSpace, puzzle instance space puzSpace and solution space
  solnSpace.

  • Selects the long term puzzle generation key $s \leftarrow \mathsf{sSpace}$.

  • Sets $\Pi$ to additional public information, such as some description of
  algorithms required for the client puzzle.

  • Sets $\mathsf{params} \leftarrow (\mathsf{sSpace}, \mathsf{puzSpace}, \mathsf{solnSpace}, \mathsf{QSpace}, \Pi)$ and outputs
  $(\mathsf{params}, s)$.

The tuple params is the public system parameters and as such is not explicitly
given as an input to other algorithms. The value $s$ is kept private by the puzzle
generator.

• GenPuz is a p.p.t. puzzle generation algorithm. On input of $s \in \mathsf{sSpace}$,
$Q \in \mathsf{QSpace}$ and $\mathsf{str} \in \mathsf{strSpace}$ it outputs a puzzle instance $\mathsf{puz} \in \mathsf{puzSpace}$.

• FindSoln is a probabilistic solution finding algorithm. On input of
$\mathsf{puz} \in \mathsf{puzSpace}$ and a run time $\tau \in \mathbb{N}$ it outputs a potential solution
$\mathsf{soln} \in \mathsf{solnSpace}$ after at most $\tau$ clock cycles of execution.

• VerAuth is a d.p.t. puzzle authenticity verification algorithm. On input of
$s \in \mathsf{sSpace}$ and $\mathsf{puz} \in \mathsf{puzSpace}$ this outputs true or false.

• VerSoln is a deterministic solution verification algorithm. On input of
$\mathsf{puz} \in \mathsf{puzSpace}$ and a potential solution $\mathsf{soln} \in \mathsf{solnSpace}$ this outputs true or false.

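For correctness we require that if $(\mathsf{params}, s) \leftarrow \mathsf{Setup}(1^k)$ and
$\mathsf{puz} \leftarrow \mathsf{GenPuz}(s, Q, \mathsf{str})$, for $Q \in \mathsf{QSpace}$ and $\mathsf{str} \in \mathsf{strSpace}$, then

• $\mathsf{VerAuth}(s, \mathsf{puz}) = \mathsf{true}$ and

• $\exists\, \tau \in \mathbb{N}$ such that $\mathsf{soln} \leftarrow \mathsf{FindSoln}(\mathsf{puz}, \tau)$ and $\mathsf{VerSoln}(\mathsf{puz}, \mathsf{soln}) = \mathsf{true}$.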

Remark 1. Typically client puzzles use a set of system parameters, most notably
system time, as input to the puzzle generation algorithm. This is so the server 
has a mechanism for expiring puzzles handed out to clients. In our model we use 
str to capture this input and do not enforce any particular structure on it. 

Remark 2. To prevent DoS attacks that exhaust the server memory it is desir- 
able that the server stores as little state as possible for uncompleted protocol runs 
(i.e. before a puzzle has been solved). We refer to this concern of client puzzles 
as “state storage costs” j2|. We build this into our definition of a client puzzle by 
insisting that only a single value, s, is stored by a server; all the data necessary 
to solve a given puzzle and to re-generate, and hence verify authenticity of a 
puzzle and solution pair, is included in the puzzle description puz. 

Remark 3. Generally, for a puzzle to be “secure” when used within a client 
puzzle protocol, we want puzzles generated to be unique and for puzzle and 
solution pairs to only be validly used once by a client. In actual usage, a server can 
filter out resubmitted correctly solved puzzle and solution pairs by, for example, 
using a hash table mechanism. Uniqueness of puzzles can be ensured by having 
GenPuz select a random nonce ns and use this in the puzzle generation. 

Remark 4. Our definition assumes private verifiability for VerAuth. Generally 
the only party concerned with checking who generated a given puzzle is the 
puzzle generator (client puzzles are used before any other transactions take place 
and to protect the generator and no other party). Although in some cases it may 
be useful to have publicly verifiable puzzles it would complicate the definition 
and we choose to keep our definition practical yet as simple as possible.

4 Security Notions for Client Puzzles

We define two notions for client puzzles. The first measures the ability of an 
adversary to produce a correctly authenticating puzzle with an unknown private 
key. We refer to this as the ability of an adversary to forge a client puzzle. The 
second notion gives a measure of the likelihood of an adversary finding a solution 
to a given puzzle within a given number of clock cycles of execution. We refer 
to this as the difficulty of a client puzzle. Intuitively, these are both what one 
would expect to require from a client puzzle given its role in defenses against 
DoS attacks; being able to either forge puzzles or solve them faster than expected 
allows an adversary to mount a DoS attack. 

We first review the definition of a function family since we will use function 
families to express security of a given client puzzle in terms of difficulty. A 
function family is a map $F : I \times D \to R$. The set $I$ is the set of all possible
indices, $D$ the domain and $R$ the range. Unless otherwise specified we assume
$I = \mathbb{N}$. The set $R$ is finite and all sets are nonempty. We write $F_i : D \to R$ for
$F_i(d) = F(i, d)$ where $i \in I$ and refer to $F_i$ as an instance of $F$.

Unforgeability of Puzzles. We first define our notion of unforgeability of 
client puzzles. Intuitively, we require that an adversary that sees puzzles generated
by the server (possibly together with their associated solutions), and that can
verify the authenticity of any puzzle it chooses, cannot produce a valid looking
puzzle on its own.

To formalize unforgeability of a client puzzle we use the following game
$\mathsf{Exec}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ between a challenger C and an adversary A.

(1) The challenger C first runs Setup on input $1^k$ to obtain (params, s). The
tuple params is given to A and s is kept secret by C.

(2) The adversary A gets to make as many CreatePuz(Q, str) and CheckPuz(puz)
queries as it likes which C answers as follows.

• CreatePuz(Q, str) queries. A new puzzle is generated as
$\mathsf{puz} \leftarrow \mathsf{GenPuz}(s, Q, \mathsf{str})$ and output to A.

• CheckPuz(puz) queries. If VerAuth(s, puz) = true and puz was not output
by C in response to a CreatePuz query then C terminates the game
setting the output to 1. Otherwise false is returned to A.

(3) If C does not terminate the game in response to a CheckPuz query then
eventually A terminates and the output of the game is set to 0.

We say the adversary A wins if $\mathsf{Exec}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k) = 1$ and loses otherwise. We define
the advantage of such an adversary as

$$\mathsf{Adv}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k) = \Pr\left[\mathsf{Exec}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k) = 1\right].$$

Puzzle-unforgeability then means that no efficient adversary can win the above
game with non-negligible probability.

Definition 2 (Puzzle-unforgeability). A client puzzle CPuz is UF secure if
for any p.p.t. adversary A its advantage $\mathsf{Adv}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ is a negligible function
of $k$.

Remark 1. In the game $\mathsf{Exec}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ we allow A access to all algorithms defined
in CPuz. In particular, we allow unlimited access to the GenPuz algorithm for any 
given chosen inputs. This allows A to generate as many puzzles as it wishes (since 
A is p.p.t. it will anyway generate at most polynomially many) with any given 
chosen key and difficulty values. Notice that the adversary can find solutions to 
any puzzle by running the FindSoln algorithm which is public. These abilities 
are sufficient to mimic the environment in which a DoS attacker would sit. 
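
The unforgeability game can be sketched as a small Python harness. Everything here is a hypothetical illustration: the class and function names are ours, params is omitted, and the two toy "schemes" are plain HMAC-tagged tuples with no solution component, chosen only to show how the challenger decides the outcome. The second toy scheme leaves Q out of the authenticated data, so changing Q yields a forgery.

```python
import hashlib
import hmac
import os


class Forged(Exception):
    """Raised when a CheckPuz query is a valid puzzle the challenger never issued."""


class UFChallenger:
    def __init__(self, s, gen_puz, ver_auth):
        self._s, self._gen_puz, self._ver_auth = s, gen_puz, ver_auth
        self._issued = set()

    def create_puz(self, Q, string):
        puz = self._gen_puz(self._s, Q, string)
        self._issued.add(puz)
        return puz

    def check_puz(self, puz):
        if puz not in self._issued and self._ver_auth(self._s, puz):
            raise Forged  # the game ends with output 1
        return False


def run_uf_game(setup, gen_puz, ver_auth, adversary) -> int:
    """Return the output of the game: 1 if the adversary forged, else 0."""
    challenger = UFChallenger(setup(), gen_puz, ver_auth)
    try:
        adversary(challenger.create_puz, challenger.check_puz)
    except Forged:
        return 1
    return 0


def setup() -> bytes:
    return os.urandom(16)


def _tag(s, *fields) -> str:
    return hmac.new(s, repr(fields).encode(), hashlib.sha256).hexdigest()


# Toy scheme 1: the tag covers Q.
def gen_bound(s, Q, string):
    return (Q, string, _tag(s, Q, string))


def ver_bound(s, puz):
    Q, string, tag = puz
    return hmac.compare_digest(tag, _tag(s, Q, string))


# Toy scheme 2: the tag ignores Q, so Q can be changed freely.
def gen_unbound(s, Q, string):
    return (Q, string, _tag(s, string))


def ver_unbound(s, puz):
    _Q, string, tag = puz
    return hmac.compare_digest(tag, _tag(s, string))


def change_q_adversary(create_puz, check_puz):
    Q, string, tag = create_puz(5, b"str")
    check_puz((Q + 1, string, tag))


def replay_adversary(create_puz, check_puz):
    check_puz(create_puz(5, b"str"))
```

Running `run_uf_game(setup, gen_unbound, ver_unbound, change_q_adversary)` ends the game with output 1, while the same adversary against the scheme that authenticates Q loses; a pure replay never counts as a forgery.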

Difficulty of Solving Puzzles. We formalize the idea that a puzzle CPuz cannot
be solved trivially via the game $\mathsf{Exec}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ between a challenger C and an
adversary A. The game is defined for each hardness parameter $Q \in \mathbb{N}$ as follows:

(1) The challenger C runs Setup on input $1^k$ to obtain (params, s) and passes
params to A.

(2) The adversary A is allowed to make any number of CreatePuzSoln(str)
queries throughout the game. In response to each such query C generates
a new puzzle as $\mathsf{puz} \leftarrow \mathsf{GenPuz}(s, Q, \mathsf{str})$ and finds a solution soln such that
VerSoln(puz, soln) = true. The pair (puz, soln) is then output to A.

(3) At any point during the execution A is allowed to make a single
Test($\mathsf{str}^\dagger$) query. The challenger then generates a challenge puzzle as
$\mathsf{puz}^\dagger \leftarrow \mathsf{GenPuz}(s, Q, \mathsf{str}^\dagger)$ which it returns to A.

Adversary A terminates its execution by outputting a potential solution $\mathsf{soln}^\dagger$.
We define the running time $\tau$ of A as being the running time of all of the
experiment $\mathsf{Exec}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k)$.

We say the adversary wins $\mathsf{Exec}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ if $\mathsf{VerSoln}(\mathsf{puz}^\dagger, \mathsf{soln}^\dagger) = \mathsf{true}$. In
this case we set the output of $\mathsf{Exec}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ to be 1 and otherwise to 0. We then
define the success of an adversary A against CPuz as

$$\mathsf{Succ}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k) = \Pr\left[\mathsf{Exec}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k) = 1\right].$$

We define the difficulty of puzzle solving by requiring that for any puzzle hardness 
the success of any adversary that runs in a bounded number of steps falls below 
a certain threshold (that is related to the hardness of the puzzle). 

Definition 3 (Puzzle-difficulty). Let $\varepsilon : \mathbb{N}^2 \to (\mathbb{N} \to [0,1])$ be a family of
monotonically increasing functions. We use the notation $\varepsilon_{k,Q}(\cdot)$ for the function
within this family corresponding to security parameter $k$ and hardness parameter
$Q$. We say a client puzzle CPuz is $\varepsilon(\cdot)$-DIFF if for all $\tau \in \mathbb{N}$, for all adversaries
A in $\mathsf{Exec}^{Q,\mathsf{DIFF}}_{\mathcal{A},\mathsf{CPuz}}(k)$, for all security parameters $k \in \mathbb{N}$, and for all $Q \in \mathbb{N}$ it holds
that

$$\mathsf{Succ}^{Q,\mathsf{DIFF}}_{\mathcal{A}_\tau,\mathsf{CPuz}}(k) \le \varepsilon_{k,Q}(\tau)$$

where $\mathcal{A}_\tau$ is the adversary A restricted to at most $\tau$ clock cycles of execution.

Remark 1. The security game above allows A to obtain many puzzle and
solution pairs by making queries to model actual usage in DoS settings; when
a client puzzle is used as part of a client puzzle protocol an adversary may see
many such puzzles and solutions exchanged between a given generator and solver
on a network. The adversary could then learn something from these.

Remark 2. In the definition of the Test query, we do allow the string $\mathsf{str}^\dagger$ to
be one previously submitted as a CreatePuzSoln query and allow CreatePuzSoln
queries on any string including $\mathsf{str}^\dagger$ after the Test query. It then immediately
follows that a difficult puzzle needs to be such that each puzzle generated is 
unique. Otherwise, a previously obtained solution through the CreatePuzSoln 
query may serve as a solution to the challenge query. Furthermore, it also follows 
that solutions to some puzzles should not be related to the solutions of other 
puzzles, as otherwise a generalization of the above attack would work. 
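
The replay argument of Remark 2 can be sketched with a small Python harness for the difficulty game. This is a hypothetical illustration only: the running-time restriction on the adversary is not modelled, the toy puzzle hides Q leading bytes (rather than bits) of a SHA-256 output, and all names are ours. A deterministic generator loses to a replaying adversary, while a generator that mixes in a fresh nonce does not.

```python
import hashlib
import os


def run_diff_game(gen_puz_soln, ver_soln, adversary, Q, key_bytes=16) -> int:
    """Return 1 if the adversary solves the challenge puzzle, else 0."""
    s = os.urandom(key_bytes)
    challenge = []

    def create_puz_soln(string):
        return gen_puz_soln(s, Q, string)  # (puz, soln) pair

    def test(string):
        if challenge:
            raise RuntimeError("only a single Test query is allowed")
        puz, _soln = gen_puz_soln(s, Q, string)  # the solution is withheld
        challenge.append(puz)
        return puz

    soln = adversary(create_puz_soln, test)
    return int(bool(challenge) and ver_soln(challenge[0], soln))


def make_generator(use_nonce: bool):
    """Toy puzzle generator; with use_nonce=False puzzles repeat for equal str."""
    def gen_puz_soln(s, Q, string):
        nonce = os.urandom(16) if use_nonce else b""
        x = hashlib.sha256(s + bytes([Q]) + string + nonce).digest()
        puz = (Q, string, nonce, x[Q:], hashlib.sha256(x).digest())
        return puz, x
    return gen_puz_soln


def ver_soln(puz, soln) -> bool:
    Q, _string, _nonce, hint, y = puz
    return (isinstance(soln, bytes) and soln[Q:] == hint
            and hashlib.sha256(soln).digest() == y)


def replay_adversary(create_puz_soln, test):
    """Ask for a solved puzzle on str, then request the challenge on the same str."""
    _puz, soln = create_puz_soln(b"same str")
    test(b"same str")
    return soln
```

Against `make_generator(False)` the replayed solution is accepted for the challenge, which is exactly why a difficult puzzle must make every generated instance unique.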

Remark 3. The queries CreatePuz (used in the game for puzzle-unforgeability)
and CreatePuzSoln (used in the above game) are related, but differ in two ways.
First, the query CreatePuzSoln outputs a puzzle together with its solution. The
second difference is more subtle: in a CreatePuz query we allow A to specify the
value of Q used but in CreatePuzSoln we do not (the value of Q is fixed
throughout the difficulty game).

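Remark 4. Clearly any puzzle that is $\varepsilon(\cdot)$-DIFF is also $(\varepsilon(\cdot) + \mu)$-DIFF where
$\mu \in \mathbb{R}_{>0}$ is such that $\varepsilon(\tau) + \mu \le 1$ (since $\mathsf{Succ}^{Q,\mathsf{DIFF}}_{\mathcal{A}_\tau,\mathsf{CPuz}}(k) \le \varepsilon_{k,Q}(\tau) \le \varepsilon_{k,Q}(\tau) + \mu$).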
The most accurate measure of difficulty for a given puzzle CPuz is then the 
function e ( t ') = infy^ Succ^ D |- F p F uz (fc). 

Remark 5. Since we measure the running time of the adversary in clock cycles, 
the model abstracts away the possibility that the adversary may be distributed 
and thus facilitates further analysis (for example of the effectiveness of client 
puzzle defense against DoS attacks). 


5 An Attack on the Juels and Brainard Puzzles 

In this section we describe an attack on the Juels and Brainard client puzzle
mechanism as described in Section 2. The attack works because puzzles are
forgeable, which is due to a crucial weakness in puzzle generation; each set of 
generation parameters defines a family of puzzles each with a different hardness 
value. Finally we construct a DDoS attack on servers using certain client puzzle 
protocols based on this construction. This attack clearly demonstrates the ap- 
plicability of our definitions and how they can be used to find problems with a 
given client puzzle construction. 

Proving Forgeability. The reason the construction is forgeable is that the
authentication is not unique to a given instance but covers a number of instances of
varying difficulty. This occurs because the puzzle instance difficulty is not included in
the first preimage of the sub-puzzle construction. We exploit this weakness and
construct an adversary A with $\mathsf{Adv}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k) = 1$. We have the following lemma
regarding the forgeability of the Juels and Brainard construction.

Lemma 1. The client puzzle construction of Juels and Brainard is not UF
secure. 

Proof. To prove this we construct an adversary A against the UF security of the
construction that can win the security game $\mathsf{Exec}^{\mathsf{UF}}_{\mathcal{A},\mathsf{CPuz}}(k)$ with probability 1.
We now describe the details of A.

At the start of the security game A is given a set of public parameters. The
adversary then makes a query CreatePuz(Q, str) for some random choices of Q
and str where $Q = (\alpha, \beta)$ and receives a puzzle instance
$\mathsf{puz} = (Q, \mathsf{str}, P = (P_1, P_2, \ldots, P_\beta))$ in response. Next A removes the first bit of each
$P_i$ to obtain $P_i^\dagger$ and constructs $Q^\dagger = (\alpha + 1, \beta)$ and
$\mathsf{puz}^\dagger = (Q^\dagger, \mathsf{str}, P^\dagger = (P_1^\dagger, P_2^\dagger, \ldots, P_\beta^\dagger))$.
The adversary then makes a query CheckPuz($\mathsf{puz}^\dagger$).

Clearly A wins with probability 1 since $\mathsf{puz}$ and $\mathsf{puz}^\dagger$ are both generated from
the same $s$ and $\mathsf{str}$, hence $\mathsf{puz}^\dagger$ will correctly verify yet was not output from a
CreatePuz query. □
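
The forgery in the proof can be reproduced with a few lines of Python. This is a sketch under assumptions, not the authors' code: SHA-256 stands in for H, bit strings are Python strings of '0' and '1', (s, str, j) is encoded with a '|' separator, and the function names are ours. Note that `forge` never touches the secret s.

```python
import hashlib


def H(data: bytes) -> str:
    """Hash to a 256-character bit string (assumption: H is SHA-256)."""
    return "".join(f"{b:08b}" for b in hashlib.sha256(data).digest())


def gen_puzzle(s: bytes, string: bytes, alpha: int, beta: int):
    """Server-side generation: P_j = (z_j, y_j) with z_j = sigma_j<alpha+1, m>."""
    subs = []
    for j in range(1, beta + 1):
        sigma = H(s + b"|" + string + b"|" + str(j).encode())
        subs.append((sigma[alpha:], H(sigma.encode())))
    return ((alpha, beta), string, tuple(subs))


def ver_auth(s: bytes, puz) -> bool:
    """Server-side check: regenerate the puzzle from s and compare."""
    (alpha, beta), string, _subs = puz
    return gen_puzzle(s, string, alpha, beta) == puz


def forge(puz):
    """Adversary: drop the first bit of every z_j and claim Q' = (alpha + 1, beta)."""
    (alpha, beta), string, subs = puz
    forged_subs = tuple((z[1:], y) for z, y in subs)
    return ((alpha + 1, beta), string, forged_subs)
```

Because $\sigma_j$ does not depend on $\alpha$, the regenerated sub-puzzles for $\alpha + 1$ are exactly the shortened ones, so `ver_auth(s, forge(puz))` holds for every puzzle the server issues.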

Remark 1. One could also prove Lemma 1 by having the adversary construct
the forgery as $Q^\dagger = (\alpha, \beta - 1)$ and then $\mathsf{puz}^\dagger = (Q^\dagger, \mathsf{str}, (P_1, \ldots, P_{\beta-1}))$. One
could also vary the number of bits moved between $\alpha$ and each $P_i$ or change the
number of sub-puzzles deleted. We give the proof in the manner above because
this specific method allows for the construction of a DDoS attack under the
assumptions we make about the protocol using this particular client puzzle. We
describe this attack next.

Constructing a DDoS Attack. We now use the forgeability of the construction
to mount a DDoS attack on client puzzle protocols based on this client
puzzle. The attack works when the difficulty parameter is increased in a certain
way and when the hash table, mentioned in m and used to prevent multiple 
puzzle instance and solution submissions, is based on some unique data for each 
instance that is not in the preimage of any sub-puzzle. A hash table mechanism 
that depends on some unique data contained in each sub-puzzle preimage, as 
is mentioned in El. would thwart the following DDoS attack on client puzzle 
protocols based on this client puzzle. 

We first assume the client puzzle is used in the client puzzle protocol of na 
and the generator increases $Q$ by increasing $\alpha$ many times for each increase in
$\beta$. We also assume any hash tables used are computed using either the puzzle
instance alone or the correct solutions alone. 

To mount the DDoS attack the adversary commands each of its zombies (platforms
the adversary controls) to start a run of the protocol with the server under
attack. The server will begin to issue puzzle instances and then, when enough
requests are received, will increase $Q$ by incrementing $\alpha$. Each zombie computes
a solution to the first puzzle it receives and submits this to the server. Then,
while this puzzle has not expired, each time $\alpha$ is incremented, a new puzzle and
solution pair is trivially computed by removing the first bit from each $z_j$ and
concatenating this to the end of each $\mathsf{soln}_j$ previously computed. The new puzzle
and solution pair are submitted to the server and will correctly verify and
will then be allocated buffer space (due to our assumptions on the hash table
mechanism). When a zombie's puzzle expires it obtains a new one. As the value
$Q$ is increased, so is the puzzle expiry period; hence more forged puzzles
can be used per valid puzzle obtained, eventually exhausting the memory
resources of the server.
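
The step that turns one solved puzzle into the next can be sketched as follows. As before this is a hypothetical Python illustration: SHA-256 stands in for H, bit strings are '0'/'1' strings, the '|' encoding and all names are assumptions, and protocol details such as expiry and the hash table are not modelled. The function `recycle` uses no secret and no brute-force work.

```python
import hashlib


def H(data: bytes) -> str:
    """Hash to a 256-character bit string (assumption: H is SHA-256)."""
    return "".join(f"{b:08b}" for b in hashlib.sha256(data).digest())


def sigmas(s: bytes, string: bytes, beta: int):
    """The values sigma_j = H(s, str, j) for j = 1..beta (server-side only)."""
    return [H(s + b"|" + string + b"|" + str(j).encode())
            for j in range(1, beta + 1)]


def gen_puzzle(s: bytes, string: bytes, alpha: int, beta: int):
    subs = tuple((sig[alpha:], H(sig.encode())) for sig in sigmas(s, string, beta))
    return ((alpha, beta), string, subs)


def ver_auth(s: bytes, puz) -> bool:
    (alpha, beta), string, _subs = puz
    return gen_puzzle(s, string, alpha, beta) == puz


def ver_soln(puz, soln) -> bool:
    _q, _string, subs = puz
    return len(soln) == len(subs) and all(
        H((x + z).encode()) == y for x, (z, y) in zip(soln, subs))


def recycle(puz, soln):
    """Move the first bit of each z_j to the end of soln_j and bump alpha by one."""
    (alpha, beta), string, subs = puz
    new_subs = tuple((z[1:], y) for z, y in subs)
    new_soln = tuple(x + z[0] for x, (z, _y) in zip(soln, subs))
    return ((alpha + 1, beta), string, new_subs), new_soln
```

Each call to `recycle` yields a distinct pair that passes both server checks, so one honest solution supplies a fresh valid submission for every increment of $\alpha$ during the puzzle's lifetime.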

Also, even if we assume that the buffer allocation based on the hash table 
mechanism is as in m the attack will still consume a huge amount of server 
computational resources. This is because the adversary can trivially spoof new 
puzzle instances and solutions from previous ones. These will not be allocated 
buffer space due to the hash table mechanism, but will consume computation 
via server verification computations. In the next section we give an example
instantiation of a generic construction that is a repaired version of the sub-puzzle
mechanism: an unforgeable version of the sub-puzzle construction.

6 A Generic Client Puzzle Construction 

In this section we provide a generic construction for a client puzzle which also 
repairs the flaw identified in the previous section with respect to the Juels and 
Brainard puzzle. Our construction is based on a pseudorandom function (PRF) 
and a one way function (OWF). We prove our generic construction is secure 
according to the definitions we put forth in this paper, and show one possible 
instantiation. Intuitively, the unforgeability of puzzles is ensured by the use 
of the PRF and the difficulty of solving puzzles is ensured by the hardness of 
inverting the one-way function. We first review some notational conventions and 
definitions regarding function families, pseudorandom functions, and concrete 
notions for pseudorandom function families and one way function families. 

If $F$ is a function family then we use the notation $f \leftarrow F$ for $i \leftarrow I$; $f \leftarrow F_i$.
We denote the set of all possible functions mapping elements of $D$ to $R$ by
$\mathsf{Func}(D, R)$. A random function from $D$ to $R$ is then a function selected uniformly
at random from $\mathsf{Func}(D, R)$.

Pseudorandomness. We define the PRF game $\mathsf{Exec}^{\mathsf{PRF}\text{-}b}_{\mathcal{B},F}(k)$ for an adversary B
against the function family $F : \mathcal{K} \times D \to R$, where $|\mathcal{K}| = 2^k$, as follows.

(1) For $b = 1$ the adversary B has black box access to a truly random function
$\mathcal{R}$ from the set $\mathsf{Func}(D, R)$ and for $b = 0$ the adversary B has black box
access to a function $F_s$ chosen at random from $F$.

(2) The adversary B is allowed to ask as many queries as it wants to whichever
function it has black box access to. Eventually B terminates outputting a
bit $b^*$.

We set the output of $\mathsf{Exec}^{\mathsf{PRF}\text{-}b}_{\mathcal{B},F}(k)$ to 1 if $b^* = b$ and set the output to 0 otherwise.
We then define the advantage of an adversary against $F$ in terms of PRF as

$$\mathsf{Adv}^{\mathsf{PRF}}_{\mathcal{B},F}(k) = \left|\Pr\left[\mathsf{Exec}^{\mathsf{PRF}\text{-}0}_{\mathcal{B},F}(k) = 1\right] - \Pr\left[\mathsf{Exec}^{\mathsf{PRF}\text{-}1}_{\mathcal{B},F}(k) = 1\right]\right|.$$

Concrete Pseudorandom and One Way Function Families. Here we
briefly review concrete notions of security for pseudorandom function and one
way function families. We depart from the typical “$(\varepsilon, \tau)$-hardness” style of
definitions, as they are not sufficient for our purposes. Instead we view $\varepsilon$, the
probability of a break, as a function of the running time $\tau$ of the adversary. So,
a primitive is $\varepsilon(\cdot)$-secure if for all adversaries running in time $\tau$ the probability
of breaking the primitive is at most $\varepsilon(\tau)$.

Definition 4 ($\nu_k(\cdot)$-PRFF). Let $F : \mathcal{K} \times D \to R$ be a function family and
$\nu : \mathbb{N} \to (\mathbb{N} \to [0,1])$ be a family of monotonically increasing functions. We
say $F$ is a $\nu_k(\cdot)$-PRFF if for all $k \in \mathbb{N}$ and for all adversaries A it holds that
$\mathsf{Adv}^{\mathsf{PRF}}_{\mathcal{A}_\tau,F}(k) \le \nu_k(\tau)$.

Note that, in the definition of a $\nu_k(\cdot)$-PRFF, the security parameter $k$ specifies
the size of the keyspace for the game $\mathsf{Exec}^{\mathsf{PRF}\text{-}b}_{\mathcal{A},F}(k)$ and the actual key, and hence
function from the family used, is chosen at random from this keyspace.

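Definition 5 ($\varepsilon_i(\cdot)$-OWF). For an adversary A we define its advantage against
a function $\varphi : \mathcal{X} \to \mathcal{Y}$, where $\mathcal{X}$ is fixed and finite, in terms of OWF as

$$\mathsf{Adv}^{\mathsf{OWF}}_{\mathcal{A},\varphi} = \Pr\left[x \leftarrow \mathcal{X};\ y \leftarrow \varphi(x);\ x' \leftarrow \mathcal{A}(y) : \varphi(x') = y\right].$$

Let $\varepsilon_i : \mathbb{N} \to [0, 1]$ be a monotonically increasing function. Then, the function $\varphi$
is an $\varepsilon_i(\cdot)$-OWF if for all adversaries A it holds that $\mathsf{Adv}^{\mathsf{OWF}}_{\mathcal{A}_\tau,\varphi} \le \varepsilon_i(\tau)$.

We then extend this definition to a family of functions as follows:

Definition 6 ($\varepsilon(\cdot)$-OWFF). Let $\varphi : \mathbb{N} \to (\mathcal{X} \to \mathcal{Y})$ and $\varepsilon : \mathbb{N} \to (\mathbb{N} \to [0,1])$
be function families. We say $\varphi$ is an $\varepsilon(\cdot)$-OWFF if for all $i \in \mathbb{N}$ the function
$\varphi_i : \mathcal{X} \to \mathcal{Y}$ is an $\varepsilon_i(\cdot)$-OWF.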

The Generic Construction. Our generic construction is based on the method
of Juels and Brainard. Most client puzzle constructions based on one way
functions, such as the discrete log based scheme of [2%] . and the RSA based 
scheme of [10], can be described in this manner with some minor modifications. 
So, our generic construction pins down sufficient assumptions on the building
blocks that imply security of the resulting puzzle. We let $k \in \mathbb{N}$ then let
$F : \mathcal{K} \times D \to \mathcal{X}$ where $|\mathcal{X}| \ge |\mathcal{K}| = 2^k$ be a function family indexed by elements
of $\mathcal{K}$. The domain $D$ of $F_s$ is 3-tuples of the form $\mathbb{N} \times \{0,1\}^* \times \{0,1\}^k \cong \{0,1\}^*$.
We write $F_s(\langle \cdot, \cdot, \cdot \rangle)$ when we want to specify the exact encoding of an element
of $D$ explicitly as an input to $F_s$. We further let $\varphi : \mathbb{N} \to (\mathcal{X} \to \mathcal{Y})$ be a family
of functions indexed by $Q$. We assume there is a polynomial time algorithm to
compute $\varphi_Q$ for each value of $Q$ and input. The various algorithms in the scheme
are then as follows:

Setup($1^k$). The various spaces are chosen: $\mathsf{sSpace} \leftarrow \mathcal{K}$, $\mathsf{QSpace} \leftarrow \mathbb{N}$,
$\mathsf{strSpace} \leftarrow \{0,1\}^*$, $\mathsf{solnSpace} \leftarrow \mathcal{X}$ and
$\mathsf{puzSpace} \leftarrow \mathsf{QSpace} \times \mathsf{strSpace} \times \{0,1\}^k \times \mathcal{Y}$. The
parameter $\Pi$ is assigned to be the polynomial time algorithm to compute $\varphi_Q$
for all $Q \in \mathsf{QSpace}$ and $x \in \mathcal{X}$. Finally, the value $s$ is chosen as $s \leftarrow \mathsf{sSpace}$
and the tuple params is constructed and then output.


Fig. 2. Solid arrows are actions performed by a generator and dashed ones by a solver. 
The lists of algorithms above/below arrows imply the actions are performed as part 
of these algorithms. The details of how each action is used in the given algorithm are 
given in the full description. 


GenPuz($s$, $Q$, $\mathsf{str}$). A nonce is selected $n_s \leftarrow \{0,1\}^k$. Next $x$ is computed as
$x \leftarrow F_s(Q, \mathsf{str}, n_s)$. The value $y \in \mathcal{Y}$ is computed as $y \leftarrow \varphi_Q(x)$ and the puzzle
assigned to be $\mathsf{puz} = (Q, \mathsf{str}, n_s, y)$ and output.

FindSoln($\mathsf{puz}$, $\tau$). While this algorithm is within the allowed number of clock
cycles of execution it randomly samples elements from the set of possible
solutions without replacement and for each potential preimage $x' \in \mathcal{X}$
computes $y' \leftarrow \varphi_Q(x')$. If $y' = y$ this outputs $x'$ then halts and otherwise continues
with random sampling. If this algorithm reaches the last clock cycle of
execution then it outputs a random element of the remaining unsampled preimage
space. The set of possible solutions is generally a subset of $\mathcal{X}$ that is defined
by the value $y$, of size dependent upon $Q$ in some manner; the details of how
the size varies depend upon the function family $\varphi$.

VerAuth($s$, $\mathsf{puz}'$). For a puzzle $\mathsf{puz}' = (Q', \mathsf{str}', n'_s, y')$ this computes $x'$ as
$x' \leftarrow F_s(Q', \mathsf{str}', n'_s)$ then $y \leftarrow \varphi_{Q'}(x')$. If $y' = y$ this outputs true and otherwise
outputs false.

VerSoln($\mathsf{puz}'$, $\mathsf{soln}'$). Given a potential solution $\mathsf{soln}' = x'$ this checks if
$\varphi_{Q'}(x') = y'$ and if so outputs true and otherwise outputs false.

We use the notation $\mathsf{CPuz} = \mathsf{PROWF}(F; \varphi)$ for the generic construction in
this manner. The construction is summarized in Figure 2.
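
The algorithms above can be sketched as a small Python class that is parameterised by the two building blocks. This is an illustration under assumptions rather than the authors' implementation: the class and parameter names are ours, `candidates(Q, y)` is a hypothetical helper that enumerates the set of possible solutions, the run time is counted in candidate evaluations instead of clock cycles, and `find_soln` enumerates deterministically and returns `None` on timeout instead of sampling at random. The toy instantiation uses HMAC-SHA256 as a stand-in for the PRF and hides the low Q bits of a 256-bit integer.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Puzzle:
    Q: int
    string: bytes
    nonce: bytes
    y: Any


class PROWF:
    """Sketch of CPuz = PROWF(F; phi) for caller-supplied F and phi."""

    def __init__(self, F, phi, candidates, key_bytes: int = 16):
        self.F, self.phi, self.candidates = F, phi, candidates
        self.key_bytes = key_bytes

    def setup(self) -> bytes:
        return os.urandom(self.key_bytes)           # s <- sSpace

    def gen_puz(self, s: bytes, Q: int, string: bytes) -> Puzzle:
        nonce = os.urandom(self.key_bytes)          # n_s <- {0,1}^k
        x = self.F(s, Q, string, nonce)             # x <- F_s(Q, str, n_s)
        return Puzzle(Q, string, nonce, self.phi(Q, x))

    def find_soln(self, puz: Puzzle, tau: int):
        for steps, x in enumerate(self.candidates(puz.Q, puz.y)):
            if steps >= tau:
                break
            if self.phi(puz.Q, x) == puz.y:
                return x
        return None

    def ver_auth(self, s: bytes, puz: Puzzle) -> bool:
        x = self.F(s, puz.Q, puz.string, puz.nonce)
        return self.phi(puz.Q, x) == puz.y

    def ver_soln(self, puz: Puzzle, soln) -> bool:
        return self.phi(puz.Q, soln) == puz.y


# Toy building blocks (assumptions, for demonstration only).
def toy_F(s, Q, string, nonce) -> int:
    msg = Q.to_bytes(4, "big") + len(string).to_bytes(4, "big") + string + nonce
    return int.from_bytes(hmac.new(s, msg, hashlib.sha256).digest(), "big")


def toy_phi(Q, x):
    return (x >> Q, hashlib.sha256(x.to_bytes(32, "big")).hexdigest())


def toy_candidates(Q, y):
    return ((y[0] << Q) | low for low in range(1 << Q))
```

Because Q is part of the input to F, shifting bits between the hidden and revealed parts of a puzzle no longer produces a value that `ver_auth` accepts.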

Remark 1. In the definition of an $\varepsilon(\cdot)$-OWF we specify the domain $\mathcal{X}$ is fixed
and finite but do not specify the exact size or shape of this; in our generic
construction this is set to be the output space of some PRF.

Remark 2. The exact specification of the FindSoln algorithm is not important 
for our theorems and proofs, nor is it unique. Indeed, other techniques such as 
exhaustive search may even be faster than the algorithm given. The important 
point is such an algorithm exists and can be described. 

Remark 3. The domain $D$ of $F$ is given as 3-tuples of the form $\mathbb{N} \times \{0,1\}^* \times \{0,1\}^k$
which is the same as $\{0,1\}^*$. However, we will always construct elements
of $D$ from a given tuple rather than taking a bit string and encoding it as an
element of $D$. Hence we do not refer to this as a uniquely recoverable encoding
on $D$.

Remark 4. In reality the variable $n_s$ need not be sampled at random; it just has
to be a nonce and could be instantiated with, for example, a counter. We specify
uniform sampling from the domain of $n_s$ since it makes our proofs simpler and
easier to follow.

Remark 5. Our generic construction is similar to the Juels and Brainard scheme
but avoids the forgeability problems by including the hardness parameter $Q$
in the input to $F$.

Remark 6. Finally, we remark that the generic construction where the PRF
is replaced by a MAC is not necessarily secure. Indeed, one-wayness of
the generic construction is guaranteed as long as the one-way function is applied
to a randomly chosen bit string. While this property is ensured through the use of
a pseudorandom function, it does not always hold for a MAC. For example,
the combination of an artificial MAC function for which the first half of the
output bits are constant with the OWF that discards the first half of its input
is clearly an insecure puzzle construction.

The following theorems capture the unforgeability and the level of hardness 
enjoyed by our generic construction. Their proofs can be found in the full version 
of the paper. 

Theorem 1. Let $F$ be a PRF family and $\varphi$ a family of functions as described
above such that for each value of $Q$ and for all $y \in \mathcal{Y}$ we have
$|\varphi_Q^{-1}(y)| / |\mathcal{X}| \le 1/2^k$, where $k$ is the security parameter. Then the client puzzle
defined by $\mathsf{CPuz} = \mathsf{PROWF}(F; \varphi)$ is UF secure.

To understand the role of the condition that $|\varphi_Q^{-1}(y)| / |\mathcal{X}| \le 1/2^k$ consider the
(extreme) case when F has a small constant number of images, that each
corresponds to roughly the same number of possible inputs to F. Notice that this
condition does not contradict the pseudorandomness of F, but such a function
is not sufficient to ensure unforgeability. Indeed, an attacker can select a
random $y \in \mathcal{Y}$, obtain some $x$ such that $\varphi(x) = y$ and select some random triple
$(Q, \mathsf{str}, n_s)$ as the solution to the puzzle. With probability about half, the image
of $(Q, \mathsf{str}, n_s)$ is $x$. The adversary can therefore produce solved puzzles that are
valid without interacting with the server.

Theorem 2. Let $F$ be a $\nu(\cdot)$-PRFF family for the function family $\nu(\cdot)$, $\varphi$ an
$\varepsilon(\cdot)$-OWFF for the function family $\varepsilon(\cdot)$ and $\mathsf{CPuz} = \mathsf{PROWF}(F; \varphi)$. Then the
client puzzle $\mathsf{PROWF}(F; \varphi)$ is $\gamma(\cdot)$-DIFF where

$$\gamma_{k,Q}(\tau) = 2 \cdot \nu_k(\tau + \tau_0) + \left(1 + \frac{\tau}{2^k - \tau}\right) \cdot \varepsilon_Q(\tau + \tau_1)$$

and $\tau_0, \tau_1 \in \mathbb{N}$ are some constants.

An adversary may try to solve puzzles by either computing the value
$F_s(Q, \mathsf{str}, n_s)$ for an unknown value of $s$ or by computing a preimage of $\varphi_Q$ for the value
$y$ provided. The function $\nu_k$ in Theorem 2 captures that computing $F_s$ for an
unknown value of $s$ should not be easy; the function $F$ needs to be a good PRF.
Intuitively, $k$ should be chosen to be large enough that it is easier to compute a
preimage of $y$ under $\varphi_Q$ than to compute the corresponding value $F_s(Q, \mathsf{str}, n_s)$.

Impact on Practical Implementations of Puzzles 

Some of the most popular proposals of puzzles are based on hash functions. 
In this section we instantiate our generic construction from the previous section 
using hash functions to construct the needed PRF and one-way function families. 
We obtain essentially a modified Juels and Brainard scheme that incorporates
the defence against the attack that we present in Section 5. The security analysis
is in the random oracle model.

Given a hash function $H : \{0,1\}^* \to \{0,1\}^m$, a standard construction for a PRF
family F is as follows. Key generation selects a random string $s \leftarrow \{0,1\}^k$ where
k is the security parameter. Function application is defined by $F_s(x) = H(s \| x)$
for any $x \in \{0,1\}^*$. Furthermore, given a hash function $G : \{0,1\}^* \to \{0,1\}^n$ we
define the function family $\varphi$ of functions $\varphi_Q : \{0,1\}^m \to \{0,1\}^{m-Q} \times \{0,1\}^n$ by

$\varphi_Q(x) = (x\langle Q+1, m\rangle, G(x))$. In the full version of the paper we prove that in
the random oracle model, F is a z/fc(-)-PRFF function, for some function family 
v with Vk(r) < p- and that ip is e(-)-OWFF for some function family e with 
s(r) < r/2 m + r/(2 m-< 2). Concrete bounds for the security of our construction 
follow by instantiating the bounds in Theorems Q and |3 
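
The hash-based instantiation above can be illustrated with a minimal Python sketch. It is only one reading of the construction, under several assumptions that are not in the text: SHA-256 (with distinct prefixes) stands in for both random oracles H and G, so m = n = 256; the helper names `encode`, `make_puzzle`, `solve` and `verify` are hypothetical; and the difficulty Q is kept small so that the brute-force search finishes quickly.

```python
import hashlib
import secrets

M_BITS = 256  # m: output length of H (SHA-256 is an assumption of this sketch)

def H(data: bytes) -> bytes:
    """Stand-in for the random oracle H: {0,1}* -> {0,1}^m."""
    return hashlib.sha256(b"H|" + data).digest()

def G(data: bytes) -> bytes:
    """Stand-in for the random oracle G: {0,1}* -> {0,1}^n (here n = 256)."""
    return hashlib.sha256(b"G|" + data).digest()

def F(s: bytes, x: bytes) -> bytes:
    """PRF candidate F_s(x) = H(s || x)."""
    return H(s + x)

def encode(Q: int, string: bytes, nonce: bytes) -> bytes:
    """Hypothetical unambiguous encoding of the triple (Q, str, ns)."""
    return Q.to_bytes(2, "big") + len(string).to_bytes(2, "big") + string + nonce

def phi(Q: int, x: bytes):
    """phi_Q(x) = (x<Q+1, m>, G(x)): reveal all but the first Q bits of x."""
    low_bits = int.from_bytes(x, "big") & ((1 << (M_BITS - Q)) - 1)
    return low_bits, G(x)

def make_puzzle(s: bytes, Q: int, string: bytes, nonce: bytes):
    """Server side: derive x with the PRF and publish phi_Q(x)."""
    return phi(Q, F(s, encode(Q, string, nonce)))

def solve(Q: int, puzzle):
    """Client side: brute-force the Q hidden leading bits of x."""
    low_bits, target = puzzle
    for guess in range(1 << Q):
        candidate = ((guess << (M_BITS - Q)) | low_bits).to_bytes(M_BITS // 8, "big")
        if G(candidate) == target:
            return candidate
    return None

def verify(s: bytes, Q: int, string: bytes, nonce: bytes, solution: bytes) -> bool:
    """Server side: recompute F_s(Q, str, ns) and compare with the solution."""
    return secrets.compare_digest(F(s, encode(Q, string, nonce)), solution)
```

In this sketch the client needs at most 2^Q evaluations of G, while the server verifies with a single evaluation of H, which is the asymmetry the puzzle relies on.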
Abstract. Non-malleability is an interesting and useful property which
ensures that a cryptographic protocol preserves the independence of the
underlying values: given for example an encryption $\mathcal{E}(m)$ of some unknown
message m, it should be hard to transform this ciphertext into
some encryption $\mathcal{E}(m^*)$ of a related message $m^*$. This notion has been
studied extensively for primitives like encryption, commitments and zero- 
knowledge. Non-malleability of one-way functions and hash functions has 
surfaced as a crucial property in several recent results, but it has not un- 
dergone a comprehensive treatment so far. In this paper we initiate the 
study of such non-malleable functions. We start with the design of an 
appropriate security definition. We then show that non-malleability for 
hash and one-way functions can be achieved, via a theoretical construc- 
tion that uses perfectly one-way hash functions and simulation-sound 
non-interactive zero-knowledge proofs of knowledge (NIZKPoK). We also 
discuss the complexity of non-malleable hash and one-way functions. 
Specifically, we show that such functions imply perfect one-wayness and 
we give a black-box based separation of non-malleable functions from 
one-way permutations (which our construction bypasses due to the “non- 
black-box” NIZKPoK based on trapdoor permutations). We exemplify 
the usefulness of our definition in cryptographic applications by show- 
ing that (some variant of) non-malleability is necessary and sufficient 
to securely replace one of the two random oracles in the IND-CCA en- 
cryption scheme by Bellare and Rogaway, and to improve the security of 
client-server puzzles. 

1 Introduction 

Motivation. Informally, non-malleability of some function f is a cryptographic
property that asks that learning f(x) for some x does not facilitate the task of
generating some f(x*) so that x* is related to x in some non-trivial way. This
notion is especially useful when f is used to build higher-level multi-user
protocols where non-malleability of the protocol itself is crucial (e.g., for voting or
auctioning). Non-malleability has been rather extensively studied for some
cryptographic primitives. For example, both definitions as well as constructions from
standard cryptographic assumptions are known for encryption, commitments
and zero-knowledge. Non-malleability in the case
of other primitives, notably for one-way functions and for hash functions,¹ has
only recently surfaced as a crucial property in several works, which
we discuss below.

For instance, plenty of cryptographic schemes are only proved secure in the 
random oracle (RO) model where one assumes that a hash function behaves 
as a truly random function to which every party has access. It is well-known
that such proofs do not strictly guarantee security for instantiations with hash
functions whose only design principles are based on one-wayness and/or
collision-resistance, because random functions possess multiple properties the proofs may
rely on. Hiding all partial information about pre-images, i.e., perfect one-wayness,
is one of these properties, and has been studied in [9,12]. Non-malleability is
another example of such a property. 

An illustrative example is the encryption scheme of Bellare and Rogaway,
where a ciphertext of message M has the form $(f(r), G(r) \oplus M, H(r, M))$ for
a trapdoor permutation f, hash functions G, H and random r. The scheme is
known to be IND-CCA secure in the random oracle model. However, an
instantiation of H with a malleable function for which given $H(r, M)$ it is possible to
compute $H(r, M \oplus M')$, for some fixed M' known to the attacker, renders the
scheme insecure: the attacker can recover M by submitting to the decryption
oracle the valid ciphertext $(f(r), G(r) \oplus M \oplus M', H(r, M \oplus M'))$, which decrypts to $M \oplus M'$.
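
The attack can be reproduced with a small Python sketch. Everything concrete in it is an assumption made for illustration: the trapdoor permutation f is textbook RSA with toy parameters (N = 3233), G is derived from SHA-256, and `H_malleable` is a deliberately malleable function satisfying H(r, M ⊕ M') = H(r, M) ⊕ M'. None of these choices come from the paper.

```python
import hashlib
import secrets

# Toy RSA trapdoor permutation (insecure parameters, for illustration only).
N, E, D = 3233, 17, 2753  # N = 61 * 53, E * D = 1 mod 3120

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def f(r: int) -> int:
    return pow(r, E, N)

def f_inv(c: int) -> int:
    return pow(c, D, N)

def G(r: int) -> bytes:
    return hashlib.sha256(b"G" + r.to_bytes(2, "big")).digest()[:16]

def H_malleable(r: int, m: bytes) -> bytes:
    """Deliberately malleable: H(r, m xor d) = H(r, m) xor d."""
    return xor(hashlib.sha256(b"H" + r.to_bytes(2, "big")).digest()[:16], m)

def encrypt(m: bytes):
    """Ciphertext (f(r), G(r) xor M, H(r, M)) for a 16-byte message M."""
    r = secrets.randbelow(N)
    return f(r), xor(G(r), m), H_malleable(r, m)

def decrypt(c):
    c1, c2, c3 = c
    r = f_inv(c1)
    m = xor(G(r), c2)
    return m if H_malleable(r, m) == c3 else None

def make_oracle(challenge):
    """CCA decryption oracle: answers everything except the challenge itself."""
    def oracle(c):
        if c == challenge:
            raise ValueError("challenge ciphertext may not be queried")
        return decrypt(c)
    return oracle

def attack(challenge, oracle) -> bytes:
    c1, c2, c3 = challenge
    delta = b"\x01" * 16                       # the fixed M' known to the attacker
    mauled = (c1, xor(c2, delta), xor(c3, delta))
    return xor(oracle(mauled), delta)          # oracle returns M xor M'
```

The mauled ciphertext differs from the challenge, so the oracle answers it, and the attacker removes M' from the reply to obtain M.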

It was shown in [7] that a similar attack can be carried out against the popular
OAEP encryption scheme whenever the instantiation of the underlying hash
function is malleable. A subsequent work [8] showed that some form of
non-malleability permits positive results about security of an alleviated version of
the OAEP scheme in the standard model. However, it remains unclear if the
approach to non-malleability in [8] expands beyond the OAEP example, and the
work left open the construction of non-malleable primitives. 

Another motivating example is the abstraction used to model hash functions 
in symbolic (Dolev-Yao) security analysis. In this setting it is axiomatized that 
an adversary can compute some hash only when it knows the underlying value. 
Clearly, malleable hash functions do not satisfy this axiom. Therefore, non- 
malleability for hash functions is necessary in order to ensure that symbolic 
analysis is (in general) sound with respect to the standard cryptographic model. 
Otherwise, real attacks that use malleability cannot be captured/discovered in
the more abstract symbolic model. 

In a different vein, and from a more conceptual perspective, higher-level pro- 
tocols could potentially benefit from non-malleable hash functions as a building 
block. A recent concrete example is the recommended use of such non-malleable 
hash functions in a human-computer interaction protocol for protecting local
storage. There, access should be linked to the ability to answer human-solvable
puzzles (similar to CAPTCHAs), but it should be infeasible for a machine to maul
puzzles and redirect them under a different domain to other human beings.

1 In the sequel we aggregate both one-way functions and hash functions under the
term hash functions for simplicity.

We will also discuss a construction of a cryptographic puzzle
designed to prevent DoS attacks, and show that malleability of the underlying
hash function leads to insecure constructions. 

Hence, non-malleability is a useful design principle that designers of new hash 
functions should keep in mind. At this point, however, it is not even clear what 
the exact requirements from a theoretical viewpoint are. Therefore, a first neces- 
sary step is to find a suitable definition which is (a) achievable, and (b) applica- 
ble. The next step would be to design practical hash functions and compression 
functions which are non-malleable, or which at least satisfy some weaker variant 
of non-malleability. 

Contributions. In this paper we initiate the study of non-malleable hash 
functions. We start with the design of an appropriate security definition. Our 
definition uses the standard simulation paradigm, also employed in defining
non-malleability for encryption and commitment schemes. It turns out, however, that
a careless adjustment of definitions for other primitives yields definitions for
non-malleable hash functions that cannot be realized. We therefore motivate and
provide a meaningful variation of the definition which ensures that the notion is
achievable and may be useful in applications.

Testifying to the difference from other cryptographic primitives, we note that
for non-malleable encryption the original simulation-based definition was
later shown to be equivalent to an indistinguishability-based definition. For
our case here, finding an equivalent indistinguishability-based definition for
non-malleable hash functions appears to be far from trivial, and we leave the question
as an interesting open problem. 

We then show that our definition can be met. Our construction of a
non-malleable hash function employs a perfectly one-way hash function (POWHF)
[9,12], i.e., a probabilistic hash function which hides all information about its
pre-image. Notice that this form of secrecy in itself does not ensure non-malleability,
so we make the function non-malleable by appending a simulation-sound
non-interactive zero-knowledge proof of knowledge (NIZKPoK) of the hashed
value.³ Both primitives exist, for example, if trapdoor permutations exist.

The construction we provide is probabilistic and does not achieve the desired 
level of efficiency for practical applications. We emphasize that our construction 
should be regarded as a feasibility result that shows that, in principle,
non-malleable hash functions can be built from standard assumptions. We leave open
the problem of finding a practical, deterministic solution. We note that our
definition is general enough to allow such constructions.

2 Analogously to Canetti’s terminology of perfectly one-way hash functions [2] we refer
to our construction as a hash function since we require collision resistance, although
it does not compress.

3 We remark that the intuitively appealing approach of using non-malleable encryption
or commitment schemes to directly construct non-malleable hashes does not work.
One of the reasons is that the former primitives rely on secret randomness, whereas
hash values need to be publicly verifiable given the pre-image.

Next, we investigate necessary cryptographic assumptions for building non- 
malleable functions. We provide two results. First we show that a non-malleable 
hash function needs to hide any information about the pre-image. This result 
justifies the use of POWHFs in our construction. Then we show (in the style of
Impagliazzo-Rudich) that black-box constructions of non-malleable one-way
functions from one-way permutations are in fact impossible even if the
collision-resistance requirement is dropped. To be more precise, we follow the approach of
Hsiao and Reyzin and show that no black-box security reduction is possible.
Notice that our construction circumvents the impossibility result due to the use 
of a “non-black-box” NIZKPoK. 

Finally, we study the applicability of our definition. We show that
non-malleability is in fact sufficient for secure partial instantiation of the
aforementioned encryption scheme of Bellare and Rogaway, i.e., that the scheme
remains IND-CCA secure when H is replaced with a non-malleable hash
function. Although G is still a random oracle, this partial instantiation helps to
better understand the necessary properties of the primitives and also provides a 
better security heuristic. 

We also sketch an application to the framework of cryptographic puzzles
as a defense against DoS attacks, where non-malleability surfaces as an important
property. The usefulness of the definition has also been shown elsewhere, using a
special case of a preliminary version of our definition to prove that HMAC [3] is
a secure message authentication code, assuming that the compression function
of the hash function is non-malleable. We expect further applications of
non-malleable hash functions in other areas, and some of the techniques used in our
proof here may be helpful for these scenarios. 

Related Work. Independently of our work, Canetti and Dakdouk and
Pandey et al. recently also suggested one-way functions with special
properties related to, yet different from, non-malleability, and Canetti and Varia
investigated non-malleable obfuscation. The work of Canetti and Dakdouk
introduces the notion of extractable perfect one-way functions where generating
an image also guarantees that one knows a preimage. This should even hold if
an adversary sees related images, a setting which somewhat resembles the one
that we give for non-malleability. Yet, extractability in their work is defined by
requiring the existence of a knowledge extractor which generates a preimage from the
adversary’s view, including the other images. In contrast, the common approach
to non-malleability (which we also adopt) is to deny the simulator access to the
other images, in order to capture the idea that these images should not help.
Hence the security definition of Canetti and Dakdouk is incomparable to ours. Moreover, using
their notion to show insecurity of candidate practical hashes seems
difficult: arguing about the success of an attacker under their definition involves, in
particular, showing that it is impossible to extract a pre-image when someone
produces an image. In contrast, security as defined by our notion is easier to
refute. For example, the hash functions from [7] for which flipping a bit in the
pre-image results in flipping a bit in the image are clearly insecure under our
definition.

The work by Pandey et al. defines adaptive one-way function families
where inversion for an image under some key is still infeasible, even if one is 
allowed to obtain preimages under different keys. This notion is also related to 
non-malleability and turns out to be useful to design non-malleable protocols 
like commitments and zero-knowledge proofs. Unfortunately, this strong notion 
is not known to be realizable. 

It is noteworthy that, analogously to our work here, both papers choose the 
Bellare-Rogaway encryption function as an important test case, and succeed 
in instantiating the second random oracle of the scheme. Together with the 
notion that we develop in this paper, these give three different alternatives for 
the requirements needed for this instantiation. Those works also show that the 
first random oracle could be instantiated in the standard model with a function 
which in addition to the notions they define is also pseudorandom. Unfortunately, 
no construction from standard assumptions that meets either one of the two 
resulting notions is known. In contrast, our single-oracle instantiation through a 
non-malleable hash function is possible under standard assumptions. 

The work by Canetti and Varia independently considers the notion of
verifiable non-malleable obfuscation where an adversary, given an obfuscated 
circuit, tries to produce an (obfuscated) circuit which is functionally related. 
The adversary’s success is measured against the success of a simulator given 
only an oracle implementing the original circuit functionality. Their notion of 
verifiable non-malleable obfuscators comes closest to our notion of non-malleable 
hash functions, and their construction for achieving a weaker notion of verifiable 
non-malleable obfuscation resembles our feasibility construction closely. 

The two notions are, nonetheless, different in spirit. For obfuscators the adver- 
sary’s task is to find something functionally related, whereas for non-malleable 
hash functions the adversary’s task is to find a hash of a related pre-image, thus 
capturing relations about specific values like relations among the bits. There are 
further technical differences like the fact that the (achievable) notion of weakly
verifiable non-malleable obfuscators does not support auxiliary information (as
required for our encryption case, for example), making the two notions
incomparable. More details are given in Section 3.

2 Preliminaries 

Definition 1 (Hash Functions). A hash function $\mathcal{H}$ = (HK, H, HVf) consists
of PPTAs for key generation, evaluation and verification, where

- PPTA HK for security parameter $1^k$ outputs a key K (which contains $1^k$ and
implicitly defines a domain $D_K$),

- PPTA H for inputs K and $x \in D_K$ returns a value $y \in \{0,1\}^*$,

- PTA HVf on inputs K, x, y returns a decision bit.

It is required that for any $K \leftarrow \mathrm{HK}(1^k)$, any $x \in D_K$, any $y \leftarrow \mathrm{H}(K, x)$,
algorithm HVf(K, x, y) outputs 1.

Note that we consider a very general syntax, comprising the “classical” notions
of one-way functions (with a public key) and of collision-resistant hash functions
which compress the input to a shorter digest (see [22] for definitions). In our
case the evaluation algorithm H may be probabilistic, as long as the correctness
of hash values is verifiable given the pre-image only (via HVf). Also, we do not
demand the length of the output of the hash function to be smaller than that of 
the input. However, while we capture a large class of primitives, the generalized 
syntax may not preserve all properties of the special cases, e.g., if the evaluation 
algorithm is probabilistic, two independent parties hashing the same input will 
not necessarily get the same value. 
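
The syntax (HK, H, HVf) with a probabilistic evaluation algorithm can be made concrete with a short Python sketch. This is only an illustration of the interface and its correctness condition: the salted SHA-256 construction is an assumption of the sketch, and it is not claimed to be non-malleable or perfectly one-way.

```python
import hashlib
import secrets

SALT_LEN = 16  # length of the random salt used by the probabilistic H (assumption)

def HK(k: int = 128) -> bytes:
    """Key generation: a random public key of k bits."""
    return secrets.token_bytes(k // 8)

def H(K: bytes, x: bytes) -> bytes:
    """Probabilistic evaluation: fresh salt, output is salt || digest."""
    salt = secrets.token_bytes(SALT_LEN)
    return salt + hashlib.sha256(K + salt + x).digest()

def HVf(K: bytes, x: bytes, y: bytes) -> int:
    """Verification given the pre-image only: recompute the digest under y's salt."""
    salt, digest = y[:SALT_LEN], y[SALT_LEN:]
    expected = hashlib.sha256(K + salt + x).digest()
    return int(secrets.compare_digest(expected, digest))
```

Two evaluations of `H` on the same input produce different values (the salts differ), yet both pass `HVf`, which matches the observation that independent parties hashing the same input need not obtain the same value.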

We now recall the definitions of one-wayness and collision resistance. For one- 
wayness the definition that we give is more general than the standard one in that 
it considers specific input distributions X for the function, and also accounts for 
the possibility that the adversary may have some partial information about the 
pre-image (modeled through a probabilistic function hint): 

Definition 2 (One-wayness and Collision-resistance). A hash function
$\mathcal{H}$ = (HK, H, HVf) is called

- one-way (wrt X and hint) if for any PPTA A the probability that for
$K \leftarrow \mathrm{HK}(1^k)$, $x \leftarrow X(1^k)$, $h_x \leftarrow \mathrm{hint}(K, x)$, $y \leftarrow \mathrm{H}(K, x)$ and $x^* \leftarrow A(K, y, h_x)$
we have $\mathrm{HVf}(K, x^*, y) = 1$, is negligible.

- collision-resistant if for any PPTA A the probability for $K \leftarrow \mathrm{HK}(1^k)$,
$(x, x', y) \leftarrow A(K)$ that $x \neq x'$ but $\mathrm{HVf}(K, x, y) = 1$ and $\mathrm{HVf}(K, x', y) = 1$,
is negligible.

3 Non-malleability of Hash and One-Way Functions 

Our definition for hash functions follows the classical (simulation-based)
approach for defining non-malleability. Informally, our definition requires that
for any adversary which, on input a hash value y, finds another value y* such 
that the pre-images are related, there exists a simulator which does just as well 
without ever seeing y. 

In the adversary’s attack we consider a three-stage process. The adversary 
first selects a distribution X from which a secret input x is then sampled (and 
passes on some state information). In the second stage the algorithm sees a 
hash value y of this input x, and the adversary’s goal is to create another hash 
value y* (usually different from y). In the third stage the adversary is given x 
and now has to output a pre-image x* to y* which is “related” to x (we make 
the definition stronger by giving the challenge pre-image to the adversary). The 
simulator may also pick a distribution X according to which x is sampled, but 
then it needs to specify x* directly from the key of the hash function only. 

In the second stage the adversary (and consequently the simulator) also gets
as input a “hint” $h_x$ about the original pre-image x, to represent some a-priori
information potentially gathered from other executions of other protocols in
which x is used. In fact, such side information is often crucial for the deployment
in applications, e.g., for the encryption example discussed later. As in the case of
non-malleable commitments and encryption, related pre-images are defined via
a relation R(x, x*). This relation may also depend on the distribution X to catch
significantly diverging choices of the adversary and the simulator and to possibly
restrict the choices for X, say, to require a certain min-entropy. However, unlike
for other primitives, we do not measure the success of the adversary and the
simulator for arbitrary relations R between x and x*, but instead restrict the
relations to a class $\mathcal{R}$ of admissible relations. We discuss this and other subtleties
after the definition:


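Definition 3 (NM-Hash). A hash function $\mathcal{H}$ = (HK, H, HVf) is called
non-malleable (with respect to probabilistic function hint and relation class $\mathcal{R}$) if for
any PPTA $\mathcal{A} = (A_d, A_y, A_x)$ there exists a PPTA $\mathcal{S} = (S_d, S_x)$ such that for
every relation $R \in \mathcal{R}$ the difference

$$\Pr\bigl[\mathbf{Exp}^{\text{nm-1}}_{\mathcal{H},\mathcal{A}}(k) = 1\bigr] - \Pr\bigl[\mathbf{Exp}^{\text{nm-0}}_{\mathcal{H},\mathcal{S}}(k) = 1\bigr]$$

is negligible, where:

Experiment $\mathbf{Exp}^{\text{nm-1}}_{\mathcal{H},\mathcal{A}}(k)$
    $K \leftarrow \mathrm{HK}(1^k)$
    $(X, st_d) \leftarrow A_d(K)$    // for state $st_d$
    $x \leftarrow X(1^k)$, $h_x \leftarrow \mathrm{hint}(K, x)$
    $y \leftarrow \mathrm{H}(K, x)$
    $(y^*, st_y) \leftarrow A_y(y, h_x, st_d)$
    $x^* \leftarrow A_x(x, st_y)$
    Return 1 iff
        $R(X, x, x^*)$
        $\wedge\ (x, y) \neq (x^*, y^*)$
        $\wedge\ \mathrm{HVf}(K, x^*, y^*) = 1$

Experiment $\mathbf{Exp}^{\text{nm-0}}_{\mathcal{H},\mathcal{S}}(k)$
    $K \leftarrow \mathrm{HK}(1^k)$
    $(X, st_d) \leftarrow S_d(K)$
    $x \leftarrow X(1^k)$, $h_x \leftarrow \mathrm{hint}(K, x)$
    $x^* \leftarrow S_x(h_x, st_d)$
    Return 1 iff
        $R(X, x, x^*)$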
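
To see the two experiments at work, the following Python sketch runs them against a toy hash function that is malleable by construction. All concrete choices are assumptions made for illustration: the hash `H` leaks a masked copy of the first input byte, the distribution X is uniform over 8-byte strings, the hint is empty, and the relation is x ⊕ x* = δ. The function `exp_sim` evaluates only one naive simulator, whereas the definition quantifies over all simulators; here no simulator can do better than guessing, since x is uniform and nothing about it is revealed.

```python
import hashlib
import secrets

N_BYTES = 8
DELTA = b"\x01" + b"\x00" * (N_BYTES - 1)   # relation: x xor x* = DELTA

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def HK() -> bytes:
    return secrets.token_bytes(16)

def H(K: bytes, x: bytes) -> bytes:
    """Toy malleable hash: first byte is masked but not mixed, rest is hashed."""
    return bytes([x[0] ^ K[0]]) + hashlib.sha256(K + x[1:]).digest()

def HVf(K: bytes, x: bytes, y: bytes) -> int:
    return int(H(K, x) == y)

def sample_X() -> bytes:
    return secrets.token_bytes(N_BYTES)      # uniform distribution X

def hint(K: bytes, x: bytes) -> bytes:
    return b""                               # empty side information

def R(x: bytes, x_star: bytes) -> bool:
    return xor(x, x_star) == DELTA

def exp_real() -> int:
    """Experiment nm-1 with an adversary that flips one bit of y."""
    K = HK()
    x = sample_X()
    h_x = hint(K, x)
    y = H(K, x)
    y_star = bytes([y[0] ^ 1]) + y[1:]       # A_y: maul y without knowing x
    x_star = xor(x, DELTA)                   # A_x: now given x
    return int(R(x, x_star) and (x, y) != (x_star, y_star)
               and HVf(K, x_star, y_star) == 1)

def exp_sim() -> int:
    """Experiment nm-0 with a naive simulator that guesses x* at random."""
    K = HK()
    x = sample_X()
    h_x = hint(K, x)
    x_star = secrets.token_bytes(N_BYTES)    # S_x sees only h_x and its state
    return int(R(x, x_star))
```

The adversary wins every run, while a guessing simulator succeeds with probability 2^-64 per run, so the difference in the definition is far from negligible for this toy function.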


Remark 1. Our definition is parameterized by a class of relations $\mathcal{R}$. This is
because for some relations the definition is simply not achievable, as in the case
when the relation involves the hash of x instead of x itself. More specifically,
consider the relation R(x, x*) which parses x* as (K, y) and outputs HVf(K, x, y).
Then, an adversary on input $y, h_x, st_d$ may output $y^* \leftarrow \mathrm{H}(K, (K, y))$ and then,
given x, returns x* = (K, y). This adversary succeeds in experiment $\mathbf{Exp}^{\text{nm-1}}_{\mathcal{H},\mathcal{A}}(k)$
with probability 1. In contrast, any simulator is likely to fail, as long as the
hash function does not have “weak” keys, i.e., keys for which the distribution of
generated images is trivial (such that the simulator can guess y with sufficiently
high probability).
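
The adversary of this remark works against any hash function, which the following Python sketch illustrates. The salted SHA-256 hash and the byte encoding of the pair (K, y) as `K + y` are assumptions of the sketch, chosen only so that the relation can parse x* unambiguously.

```python
import hashlib
import secrets

KEY_LEN, SALT_LEN = 16, 16

def HK() -> bytes:
    return secrets.token_bytes(KEY_LEN)

def H(K: bytes, x: bytes) -> bytes:
    salt = secrets.token_bytes(SALT_LEN)
    return salt + hashlib.sha256(K + salt + x).digest()

def HVf(K: bytes, x: bytes, y: bytes) -> int:
    salt, digest = y[:SALT_LEN], y[SALT_LEN:]
    return int(hashlib.sha256(K + salt + x).digest() == digest)

def R_bad(x: bytes, x_star: bytes) -> bool:
    """Unachievable relation: parse x* as (K, y) and check that y is a hash of x."""
    K, y = x_star[:KEY_LEN], x_star[KEY_LEN:]
    return HVf(K, x, y) == 1

def adversary_run() -> bool:
    K = HK()
    x = secrets.token_bytes(16)      # secret input, never used by A_y
    y = H(K, x)
    x_star = K + y                   # encoding of the pair (K, y)
    y_star = H(K, x_star)            # A_y: hash the pair itself
    return (R_bad(x, x_star) and (x, y) != (x_star, y_star)
            and HVf(K, x_star, y_star) == 1)
```

A simulator, which never sees y, would have to output a valid hash value of the unknown x inside x*, which is why the class of relations has to be restricted.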

We resolve this problem by requiring the definition to hold for a subset $\mathcal{R}$ of
all relations. It is of course desirable to seek secure constructions with respect
to very broad classes of relations (cf. our construction in Section 4) which are
more handy for general deployment. At the same time, certain scenarios may
only require non-malleability with respect to a small set of relations (cf. the
application examples discussed later). Our definition is general and permits
easy tuning for the needs of a particular application or a class of applications.

4 Throughout the paper all hint functions and relations are assumed to be efficient. We
furthermore assume that the security parameter is given in unary to all algorithms
as additional input (if not mentioned explicitly).

Remark 2. For virtually all “interesting” functions $\mathcal{H}$ and relation classes $\mathcal{R}$ the
definition is achievable only for adversaries and simulators that output
descriptions of well-spread distributions X (i.e., with super-logarithmic min-entropy).
For the construction in the next section we also require hint to be a so-called
uninvertible function (for which finding the exact pre-image is infeasible). Note
that uninvertibility is a weaker requirement than one-wayness, as it holds for 
example for constant functions. We prefer to keep the definition as general as 
possible, so we do not explicitly impose such restrictions on the adversary, sim- 
ulator, and hint. 

Remark 3. In our definition we demand that the simulator outputs x* given
K and $h_x$ only. A weaker condition would be to have a simulator $S_y(h_x, st_d)$
first output y*, like the adversary $A_y$, and then $x^* \leftarrow S_x(x, st_y)$, before checking
that R(X, x, x*) and that HVf(K, x*, y*) = 1. Since in this case the simulator
in the second stage is also given x we call this a weak simulator and hash
functions achieving this notion weakly non-malleable. This distinction resembles the
notions of non-malleable commitments with respect to commitment and with
respect to opening. Depending on the application scenario of non-malleable
hash functions the stronger or weaker version might be required. As an
example, the result about the Bellare-Rogaway encryption scheme uses the stronger
definition above, and our construction in the next section achieves this stronger 
notion, which obviously implies the weaker one. 

Remark 4. Similarly to the previous variation one can let the adversary only 
output a hash value y*, and omit the step where it later also has to give x*. 
The simulator’s task, too, is then to only output a hash value. Then one defines 
meaningful relations through existential quantifications (“. . . if there exists a pre- 
image x* such that R(x, x*) holds”). This is essentially the approach taken by 
Canetti and Varia for (weakly) verifiable non-malleable obfuscators.

On the one hand the “hash-only” approach above facilitates the adversary’s 
task if it does not need to know a specific pre-image. On the other hand, it also 
simplifies the simulator’s task. As an example the adversary in our definition may 
decide upon a specific x* satisfying the relation, after seeing x. Security against 
such an attack cannot be captured by the above notion of relaxed simulators, 
whereas the simulator in our definition also needs to find an appropriate x*.
This particular example demonstrates that our approach and the definition for
(weakly) verifiable non-malleable obfuscators of Canetti and Varia are incomparable. Further
differences between the notions are the lack of auxiliary information and the
dependency of the simulator on the relation in the definition of Canetti and
Varia. In addition, the feasibility results presented later in our paper and
the solutions of Canetti and Varia are for incomparable classes of relations.

Remark 5. Note that we only demand that $(x, y) \neq (x^*, y^*)$ for the adversary’s
choice (instead of demanding $x \neq x^*$ or $y \neq y^*$), yielding a stronger
definition, especially when the randomized hash function has multiple images
for some input. Again, the particular need depends on the application and our 
solution meets this stronger requirement. 

Remark 6. In the case of non-malleable encryption the original simulation-based
definition was later shown to be equivalent to an indistinguishability-based
definition. The superficial similarity between our definition of non-malleable
hash functions and the one of non-malleable encryption suggests that this may
be possible here as well. Surprisingly, straightforward attempts to define
non-malleability of hash functions through indistinguishability do not seem to yield
an equivalent definition. We discuss this issue in the full version in more detail
(because of lack of space), and leave it as an interesting open problem to find a
suitable indistinguishability-based definition for non-malleable hash functions. 

Remark 7. The usual security notions for hash functions include one-wayness
and collision-resistance. However, neither property is known to follow from
Definition 3. Consider a constant function H which is clearly neither one-way nor
collision-resistant. But the function is weakly non-malleable as a simulator can
simulate A in a black-box way by handing the adversary the constant value. We
keep these rather orthogonal security properties separate, as some applications 
may require one but not the others. 

Remark 8. Some applications (like the HMAC example) require a
multi-valued version of the definition in which the adversary can adaptively generate
several distributions and receive the images (with side information) before
deciding upon y*. One can easily extend our definition accordingly, letting $A_d$ loop
several times, in each round i generating a distribution $X_i$ and receiving $y_i$ and
$h_{x_i}$ at the beginning of the next round and before outputting an image y*. In
general, it is possible to extend our construction to this case using stronger,
adaptive versions of POWHFs and NIZKPoKs. See Remark 1 after Theorem £] 


4 Constructing Non-malleable Hash Functions 

In this section we give feasibility results via constructions for non-malleable hash 
functions. The main ingredient of our constructions is a perfectly one-way hash
function (POWHF) [9,12], which hides all information about the pre-image but
which may still be malleable. To ensure non-malleability we tag the hash value
with a simulation-sound non-interactive zero-knowledge proof of knowledge of
the pre-image. We first recall the definitions of these two primitives.

For POWHFs we slightly adapt the definition from [9,12] to our setting.
Originally, POWHFs have been defined to have a specific input distribution X (like
the uniform distribution in [12,18]). Here we let the adversary choose the input
distribution adaptively, and merely demand that this distribution X satisfies
a certain efficient predicate $P_{pow}(X)$; this is analogous to the non-malleability
experiment in which the adversary chooses X and the relation R takes X as 
additional input. We call the side information here aux (as opposed to hint for 
non-malleability) in order to distinguish between the two primitives. In fact, in 
our construction aux uses hint as a subroutine but generates additional output. 


Definition 4 (POWHF). A hash function $\mathcal{P}$ = (POWK, POW, POWVf) is
called a perfectly one-way hash function (with respect to predicate $P_{pow}$ and
probabilistic function aux) if it is collision resistant, and if for any PPTA $B = (B_d, B_b)$,
where $B_b$ has binary output, the following random variables are computationally
indistinguishable:

First random variable:
    $K \leftarrow \mathrm{POWK}(1^k)$
    $(X, st_d) \leftarrow B_d(K)$
    $x \leftarrow X(1^k)$
    $a_x \leftarrow \mathrm{aux}(K, x)$; $y \leftarrow \mathrm{POW}(K, x)$
    $b \leftarrow B_b(y, a_x, st_d)$
    return $(K, x, b)$ if $P_{pow}(X) = 1$, else $\bot$

Second random variable:
    $K \leftarrow \mathrm{POWK}(1^k)$
    $(X, st_d) \leftarrow B_d(K)$
    $x \leftarrow X(1^k)$, $x' \leftarrow X(1^k)$
    $a_x \leftarrow \mathrm{aux}(K, x)$; $y' \leftarrow \mathrm{POW}(K, x')$
    $b \leftarrow B_b(y', a_x, st_d)$
    return $(K, x, b)$ if $P_{pow}(X) = 1$, else $\bot$


Remark 1. As pointed out in prior work, the definition only makes sense if aux is an
uninvertible function of the input (such that finding the pre-image x from $a_x$ is
infeasible) and $B_d$ only outputs descriptions of well-spread distributions (with
super-logarithmic min-entropy). Otherwise the notion is impossible to achieve.
For generality, we do not restrict X and aux explicitly here. 

Remark 2. Perfectly one-way hash functions (in the sense above) can be
constructed from any one-way permutation [12,18] (for the uniform input
distribution), any regular collision-resistant hash function (for any distribution with
fixed, super-logarithmic min-entropy), or under the decisional Diffie-Hellman
assumption (for the uniform distribution). Usually these general
constructions are not known to be secure assuming arbitrary functions aux, yet for the
particular function aux required by the application they can often be adapted
accordingly. A concrete example is given later in our discussion of the
Bellare-Rogaway encryption scheme.

On the choice of the relation class. Recall that the definition of
non-malleability is parameterized by a class of relations. As explained earlier in the
paper, no non-malleable hash function for an arbitrary class exists (see Remark 1
after Definition 3). In the sequel, we exhibit a class of relations for which we show
how to construct non-malleable hash functions, and then present our provably
secure construction.

Specifically, we consider the class of relations $\mathcal{R}^{\mathrm{rinfo}}_{\mathrm{pred}}$ parameterized by an
optional function rinfo and which consists of all relations of the form
$R(x, x^*) = P(x, P^*(\mathrm{rinfo}(x), x^*))$, for all efficient predicates P, P*.⁵ The function rinfo(x)
may be empty or consist of a small fraction of bits of x (e.g., up to logarithmically
many), and should be interpreted as the information about x that may be used
in evaluating the relation R. It is important that rinfo is an uninvertible function,
as otherwise, if one could recover x from rinfo(x), then $\mathcal{R}^{\mathrm{rinfo}}_{\mathrm{pred}}$ would comprise
all efficient relations, $R(x, x^*) = P^*(x, x^*)$, and non-malleability with respect to
this class, again, would not be achievable.

5 Where we neglect the distribution X as part of the relation’s input for the moment.

As an example consider the empty function rinfo, such that R_pred consists of
all relations R(x, x*) = P(x, P*(x*)). This class of relations allows one to check, for
instance, that individual bits of x and x* are complements of each other: if π_j
denotes the projection onto the j-th bit, then one sets P*(x*) = π_j(x*) and lets
P(x, P*(x*)) output 1 if π_j(x) ≠ π_j(x*). This example has also been used by
Boldyreva and Fischlin |Zj to show the necessity of non-malleability for OAEP, 
and to give an example of a perfectly one-way hash function that is malleable in 
the sense that flipping the first bit of an image produces a hash of the pre-image 
whose first bit is also flipped. 
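
The bit-complement example can be made concrete with a minimal Python sketch of a relation of the form R(x, x*) = P(x, P*(rinfo(x), x*)) with the empty function rinfo. The helper names (bit, make_relation, complement_at), the modeling of inputs as non-negative integers, and the convention that bit 0 is the least significant bit are assumptions made here for illustration only.

```python
from typing import Callable


def bit(x: int, j: int) -> int:
    """Projection pi_j onto the j-th bit (assumption: j = 0 is the least significant bit)."""
    return (x >> j) & 1


def make_relation(P: Callable, P_star: Callable,
                  rinfo: Callable = lambda x: None) -> Callable[[int, int], bool]:
    """Build R(x, x*) = P(x, P*(rinfo(x), x*)); by default rinfo is the empty function."""
    def R(x: int, x_star: int) -> bool:
        return bool(P(x, P_star(rinfo(x), x_star)))
    return R


def complement_at(j: int) -> Callable[[int, int], bool]:
    """Relation that holds iff the j-th bits of x and x* differ."""
    return make_relation(
        P=lambda x, b: bit(x, j) != b,                # compare pi_j(x) with the output of P*
        P_star=lambda _info, x_star: bit(x_star, j),  # P*(x*) = pi_j(x*), rinfo output unused
    )
```

For instance, complement_at(0) accepts the pair (0b1010, 0b1011), whose lowest bits differ, and rejects (0b1010, 0b1010).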

In the examples above rinfo has been the empty function. Of course, using
non-trivial functions rinfo allows for additional relations and enriches the class
R_pred. Consider for example a hash function H that is malleable in the sense that
an adversary, given H(K, r||m) for random r ∈ {0,1}^k, can compute H(K, r||m')
for some m' ≠ m. One way to capture that the two pre-images coincide on the
first k bits is to set rinfo(r||m) = r and to set P*(r, x*) = 1 if and only if r
is the prefix of x*. Since rinfo should be uninvertible, the function should rather
return only a fraction of r, though. Similarly, one can see that the class R_pred
“captures” relations like R(x, x*) = 1 iff x ⊕ x* = δ for some constant δ, and
many other useful relations.

Finally, we note that each relation from the class also checks that the chosen
input distribution X “complies” with the eligible distributions from the underlying
POWHF. That is, each relation also checks that the predicate P_pow(X)
of the POWHF is satisfied. The full relation R(X, x, x*) then evaluates to 1 iff
P(x, P*(rinfo(x), x*)) = 1 and P_pow(X) = 1. More formally, for any predicate
P_pow and uninvertible function rinfo we define the class of relations:

$$\mathcal{R}_{\mathrm{pred}} = \bigl\{\, R \;:\; \text{there exist efficient (probabilistic) predicates } P, P^* \text{ such that } R(X, x, x^*) = P\bigl(x, P^*(\mathrm{rinfo}(x), x^*)\bigr) \wedge P_{\mathrm{pow}}(X) \,\bigr\}$$


Our construction also uses a simulation-sound zero-knowledge proof of knowledge
Π = (CRS, P, V) for the NP-relation R_pow defined by:

$$R_{\mathrm{pow}} = \bigl\{ (K_{\mathrm{pow}} \,\|\, y_{\mathrm{pow}},\; x \,\|\, r) \;:\; \mathrm{POW}(K_{\mathrm{pow}}, x; r) = y_{\mathrm{pow}} \bigr\},$$

which essentially says that one “knows” a pre-image of a hash value. Simulation-sound
NIZK proofs of knowledge for such relations can be derived from trapdoor
permutations [29,14]. We recall the definition of the former in the full version.

The construction and its security. The following theorem captures the
security of our construction.

Theorem 1. Let 𝒫 = (POWK, POW, POWVf) be a perfectly one-way hash function
with respect to P_pow and aux, where aux = (hint, rinfo) for probabilistic functions
hint and rinfo. Let Π = (CRS, P, V) be a simulation-sound non-interactive
zero-knowledge proof of knowledge for relation R_pow. Then the following hash
function ℋ = (HK, H, HVf) is non-malleable with respect to hint and R_pred:

– PPTA HK on input 1^k samples K_pow ← POWK(1^k) and crs ← CRS(1^k) and
  outputs K = (K_pow, crs). The associated domain D_K is given by D_{K_pow}.

– PPTA H on input K and x ∈ D_K computes y_pow ← POW(K_pow, x; r) for
  random r ← RND_{K_pow}, as well as π ← P(crs, K_pow || y_pow, x || r). It outputs
  y = (y_pow, π).

– PTA HVf for inputs K = (K_pow, crs), x and y = (y_pow, π) outputs 1 if and
  only if

  POWVf(K_pow, x, y_pow) = 1 and V(crs, K_pow || y_pow, π) = 1.

In addition, ℋ is collision-resistant.

Due to space limitations we provide the detailed proof in the full version of the 
paper 0. 

Remark 1. The malleability adversary has access to essentially two different 
sources of partial information about x: hint(x), which it receives explicitly as
input, and rinfo(x), which it can use indirectly through the relation R. This
motivates the requirement that 𝒫 be perfectly one-way with respect to partial
information aux = (hint, rinfo).

Remark 2. As mentioned after the definition of non-malleable hash functions, 
some applications (like the one about HMAC HU) may require a stronger no- 
tion in which the adversary can adaptively generate distributions and receives 
the images, before deciding upon y*. Our construction above can be extended 
to this case, assuming that the POWHF obeys a corresponding “adaptiveness” 
property and that the zero-knowledge proof of knowledge is multiple simulation- 
sound and multiple zero-knowledge. Such adaptively-secure POWHFs (for uni- 
form distributions) can be built from one-way permutations HU and suitable 
zero-knowledge proofs exist, assuming trapdoor permutations |2?)I14I . 

5 On the Complexity of Non-malleable Functions 

In this section we discuss the existential complexity of non-malleable functions. 
We first indicate, via an oracle separation result, that deriving non-malleable 
hash and one-way functions via one-way permutations is infeasible. In the full 
version |U we also discuss the relation between non-malleability and one-wayness. 

5.1 On the Impossibility of Black-Box Reductions 

We first show that, under reasonable conditions, there is no black-box reduction 
from non-malleable hash functions (which might not even be collision-resistant
but rather one-way only) to one-way permutations. For space reasons most of
the proofs have been moved to the full version of the paper |SJ. 

Black-Box Reductions. In their seminal paper Impagliazzo and Rudich m 
have shown that some cryptographic primitives cannot be derived from other 
primitives, at least if the starting primitive is treated as a black box. Instead of 
separating primitives as in PI here we follow the more accessible approach of 
Hsiao and Reyzin E3 giving a relaxed separation result with respect to black- 
box security reductions. We give a formalization of the oracle-based black-box 
separation approach that we use in the full version. 

For our result we assume that the algorithms of the hash function TL are 
granted oracle access to a random permutation oracle V (which is one-way, of 
course). A black-box reduction to V is now an algorithm which, with oracle ac- 
cess to V and a putative successful attacker A on the non-malleability property, 
inverts V with noticeable probability. Such an attacker A may take advantage of 
another oracle O (related to V) which allows it to break the non-malleability but 
does not help to invert the one-way permutation V. Since neither the construc- 
tion nor the reduction are given access to O, the reduction must be genuinely 
black-box. 

Defining Oracles V and O. For now we let V be a random permutation
oracle, which in particular is a one-way function. Below we show through
derandomization techniques that some fixed V must also work. For our separation
we let the side information of the non-malleable hash function include
an image of the uniformly distributed input x under V. More precisely, consider
the function hint^V which on input (1^k, K, x) for random x computes
h_x = V(0^k || x || ⟨HVf⟩ || K) for the description ⟨HVf⟩ of the verification algorithm
and finally outputs h_x.⁶

We next construct the oracle O that helps to break non-malleability. The
idea is that using O it is possible to extract from the image y and “hint” h_x
(described above) the pre-image x of y. Since the adversary gets y as input, but
the simulator does not, the oracle is only helpful to the adversary. Note that
breaking non-malleability means that no simulator of comparable complexity is
able to approximate the success probability of A^{V,O} closely. To ensure that the
simulator has the same power as A^{V,O}, we therefore grant the simulator S^{V,O}
access to both oracles V and O.

Construction 1. Let oracle O take as input a parameter 1^k, an image y and
a “hint” h_x. The oracle first finds the pre-image z || x || ⟨HVf⟩ || K of h_x under V
and verifies that z = 0^k; if not, it immediately returns ⊥. Else it checks that
HVf^V(K, x, y) = 1 and returns x if so (and outputs ⊥ otherwise).
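
The behaviour of the oracle in Construction 1 can be illustrated with a toy Python sketch over a tiny domain. Everything concrete here is an assumption made for illustration: the permutation is a seeded lookup table rather than a true random oracle, the description ⟨HVf⟩ is omitted from the encoded string (the verification algorithm is fixed in code instead), and toy_hash is a hypothetical stand-in for a hash function with access to V, not a construction from the paper.

```python
import random

K_BITS = 4                       # toy parameter k: z, x and the key K are k-bit values
MASK = (1 << K_BITS) - 1


def random_permutation(bits: int, seed: int):
    """Sample a permutation of {0, ..., 2^bits - 1} and its inverse as lookup tables."""
    rng = random.Random(seed)
    table = list(range(1 << bits))
    rng.shuffle(table)
    inverse = [0] * len(table)
    for w, image in enumerate(table):
        inverse[image] = w
    return table, inverse


V, V_INV = random_permutation(3 * K_BITS, seed=2009)   # stand-in for the oracle V


def pack(z: int, x: int, key: int) -> int:
    """Encode the string z || x || K as an integer (the description of HVf is omitted)."""
    return (z << (2 * K_BITS)) | (x << K_BITS) | key


def unpack(w: int):
    """Split an encoded string back into (z, x, K)."""
    return w >> (2 * K_BITS), (w >> K_BITS) & MASK, w & MASK


def toy_hash(key: int, x: int) -> int:
    """Hypothetical hash with oracle access to V: the image of 1^k || x || K."""
    return V[pack(MASK, x, key)]


def hvf(key: int, x: int, y: int) -> bool:
    """Verification algorithm of the toy hash."""
    return toy_hash(key, x) == y


def hint(key: int, x: int) -> int:
    """Side information h_x = V(0^k || x || K)."""
    return V[pack(0, x, key)]


def oracle_O(y: int, h_x: int):
    """Invert V on h_x, check the 0^k prefix and the hash value; None plays the role of ⊥."""
    z, x, key = unpack(V_INV[h_x])
    if z != 0:
        return None
    return x if hvf(key, x, y) else None
```

Given y = toy_hash(K, x) and h_x = hint(K, x), a call oracle_O(y, h_x) returns x; a party that holds only h_x but no matching image y gets ⊥, which is the asymmetry between adversary and simulator that the separation exploits.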


6 We note that the side information h_x does not reveal any essential information about
x in the sense that one can show that, for any non-malleable hash function for the
uniform input distribution and no side information at all, the hash function remains
non-malleable with respect to h_x relative to the random permutation V (but not
relative to O, of course). Also observe that the common strategy of using black-box
simulators usually works for any side information, and in particular for the one here.

We show that O does not help to invert V, thus showing that relative to the
oracles there still exist one-way permutations:

Proposition 1. For any efficient algorithm B^{V,O}, the probability that B^{V,O} breaks
the one-wayness of V is negligible.

In light of this proposition we conclude that there exists a particular V that is hard
to invert for all PPT adversaries with oracles V, O. The argument is the same as 
in pi- For a fixed PPT adversary B , we define the sequence of events (indexed 
by k ) where B inverts strings of length k with some good probability; for a 
suitable choice of parameters, the sum of the probabilities (over V) of these 
events converges and by the first Borel-Cantelli lemma only finitely many of 
these events may occur, almost surely. Then taking the countable intersection 
over all PPT B, we get that there is at least one V with the desired property. 

Separation. We require some mild, technical conditions for our non-malleable 
hash function and the relation. Namely, we assume that 

– the hash function is non-trivial, meaning that it is infeasible to predict an
  image for uniformly distributed input over {0,1}^k (thus ruling out trivial
  examples like constant hash functions), and

– the relation class R contains the relation R_sep which on input (X, x, x*)
  checks that X is the uniform distribution on {0,1}^k, and that parity(x) =

= parity (as-*) = 0 x* . Note that R sep e 7£ pre a for our predicate-based 
relations, even for the empty function rinfo, and can thus be achieved in 
principle. 

Theorem 2. Let ℋ^V = (HK^V, H^V, HVf^V) be a non-trivial non-malleable hash
function with respect to hint^V and R ∋ R_sep. Then there exists an adversary
A^{V,O} that breaks non-malleability of ℋ^V (for any simulator S^{V,O}).

Corollary 1. There exists no black-box reduction from non-trivial non-malleable
functions (with respect to hint^V and R ∋ R_sep) to one-way permutations.

At first glance it seems as if our result would transfer (after some minor mod- 
ifications) to other non-malleable primitives like commitments. This is not the 
case. The oracle O in our construction relies on the ability to check whether a 
pre-image x matches an image y (public verifiability of hash functions), while 
other primitives such as encryption ℰ(m; r) and commitments Com(m; r) use
hidden randomness (which is not part of the input of function hint). 

Relating Non-Malleability and Perfect One-Wayness. In the full version
we show that non-malleability implies a variant of perfect one-wayness.

6 Applications 

In this section we study the usefulness of our notion for cryptographic applications.
As an example we show that when one of the two random oracles in the
aforementioned encryption scheme proposed by Bellare and Rogaway in jlj is
instantiated with a non-malleable hash function, the scheme remains IND-CCA 
secure. In addition, we argue that non-malleability is useful in preventing off-line 
computation attacks against a certain class of cryptographic puzzles. 

Instantiating random oracles. We start with recalling the scheme. Let 𝓕
be a family of trapdoor permutations and G, H be random oracles. The message
space of the scheme BR^{G,H}[𝓕] = (𝒦, ℰ, 𝒟) is the range of G. The key generation
algorithm 𝒦 outputs a random 𝓕-instance f and its inverse f^{-1} as the public
and secret key, respectively. The encryption algorithm ℰ on inputs f and m
picks random r in the domain of f (we assume that r ∈ {0,1}^k) and outputs
(f(r), G(r) ⊕ m, H(r||m)). The decryption algorithm 𝒟 on inputs f^{-1} and (y, g, h)
first computes r ← f^{-1}(y), then m ← g ⊕ G(r), and outputs m iff H(r||m) = h.
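
The data flow of the scheme can be sketched in Python as follows. This is only an illustration under loud assumptions: the trapdoor permutation is textbook RSA over a modulus built from two Mersenne primes (far too weak for real use), r is drawn from Z_N rather than from {0,1}^k, and the random oracles G and H are replaced by domain-separated SHA-256. None of these choices is taken from the paper.

```python
import hashlib
import secrets

# Toy RSA trapdoor permutation f(r) = r^e mod N (assumption: illustration only, insecure).
P_PRIME = 2**61 - 1                     # Mersenne prime
Q_PRIME = 2**89 - 1                     # Mersenne prime
N = P_PRIME * Q_PRIME
E = 65537
D = pow(E, -1, (P_PRIME - 1) * (Q_PRIME - 1))   # trapdoor exponent (needs Python 3.8+)

R_BYTES = (N.bit_length() + 7) // 8
M_BYTES = 16                            # message space = range of G


def _encode(r: int) -> bytes:
    """Fixed-length encoding of r so that r || m is unambiguous."""
    return r.to_bytes(R_BYTES, "big")


def G(r: int) -> bytes:
    """Stand-in for the random oracle G."""
    return hashlib.sha256(b"G" + _encode(r)).digest()[:M_BYTES]


def H(r: int, m: bytes) -> bytes:
    """Stand-in for the random oracle H evaluated on r || m."""
    return hashlib.sha256(b"H" + _encode(r) + m).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(m: bytes):
    """Return (f(r), G(r) xor m, H(r || m)) for fresh random r."""
    assert len(m) == M_BYTES
    r = secrets.randbelow(N)
    return pow(r, E, N), xor(G(r), m), H(r, m)


def decrypt(ciphertext):
    """Recover r with the trapdoor, unmask m, and output m only if the tag h verifies."""
    y, g, h = ciphertext
    r = pow(y, D, N)
    m = xor(g, G(r))
    return m if H(r, m) == h else None
```

A ciphertext whose second or third component has been modified is rejected (decrypt returns None), which is the role the tag H(r||m) plays in the chosen-ciphertext setting.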

The scheme BR^{G,H}[𝓕] is proven to be IND-CCA secure in the random oracle
model assuming that 𝓕 is one-way 0.

Here we study the possibility of realizing the random oracle H with an actual
hash function family ℋ = (HK, H, HVf), a so-called partial H-instantiation of
the scheme. More precisely, we modify the scheme so that the public key and
secret key also contain a key K ← HK(1^k) specifying a function. Then ℰ computes
H(K, r||m) instead of H(r||m), and 𝒟 computes HVf(K, r||m, h) instead of
checking that H(r||m) = h. We refer to this scheme as BR^{G,ℋ}[𝓕]. The following
shows that functions that meet our notion of non-malleability are sufficient for
a secure partial H-instantiation.

Before stating the sufficient conditions for security to hold, we fix some notation.
Below we let the function rinfo_BR(x) = msb_{k/2}(x) output the k/2 most
significant bits of its input. The class of relations we require here for
non-malleability is only a subset of the achievable class discussed in Section 4.
Namely, we only require a relation of the form R_BR(X, x, x*) = P*(rinfo_BR(x), x*)
∧ P_pow(X), where P_pow is the predicate that checks that X is the canonical
representation of the uniform distribution on the first k bits, and P* is the
predicate that simply verifies that msb_{k/2}(x*) = rinfo_BR(x). We choose this specific
predicate R_BR so that it can check if x = x*, while erring with only negligible
probability, but still admit the construction of non-malleable hash functions.

Below we will require that the trapdoor permutation family is msb_{k/2}-partial
one-way, meaning that it is hard to compute the k/2 most significant bits of
the random input r given a random instance f and f(r) (cf. m for the formal
definition). This is a rather mild assumption to impose on 𝓕. For example,
RSA was shown to be partial one-way under the RSA assumption in EH. A
general approach to construct such a partial one-way family 𝓕 is to define f(r) =
g(msb_{k/2}(r)) || g(lsb_{k/2}(r)) for a trapdoor permutation g.⁷

7 In fact, this construction also has the useful property that f(r) is still hard to invert,
even if given msb_{k/2}(r). Thus this trapdoor permutation is suitable for constructing
POWHFs secure with respect to side information (msb_{k/2}(r), f(r)) and therefore, via
our construction, non-malleable hash functions for side information hint_BR(r) = f(r)
and the relation R_BR. In other words, non-malleable hash functions for hint_BR and
R_BR exist under common cryptographic assumptions.

We need one more technical detail before stating the theorem. We start with
some hash function family ℋ = (HK, H, HVf) and trapdoor permutation family
𝓕. We write ℋ_𝓕 = (HK_𝓕, H, HVf) for the modified hash function for which key
generation outputs a random instance of 𝓕 along with the original hash key.
Below we write hint_BR for the function that takes as input a key (K, f) and
string x, and outputs f(r), where r are the first k bits of the input x. We note
that the IND-CPA version of the scheme by Bellare and Rogaway was shown secure
in the standard model by Canetti j0|, assuming the hash function is a POWHF
with respect to a similar hint function.

Theorem 3. Let 𝓕 be an msb_{k/2}-partial one-way trapdoor permutation family
and let ℋ_𝓕 = (HK_𝓕, H, HVf) be a collision-resistant hash function which is
non-malleable with respect to the function hint_BR and to the relation R_BR. Assume
further that ℋ_𝓕 is a perfectly one-way hash function with respect to P_pow and
hint_BR. Then BR^{G,ℋ}[𝓕] is IND-CCA secure (in the RO model).

Remark. Although the non-malleability property of the hash implies that no 
partial information about pre-images is leaked (cf. the full version for a formal 
statement of this implication), the theorem above requires the hash to be per- 
fectly one-way in the sense of Definition 0 which is a stronger requirement in 
general. The proof of the theorem is in the full version (Hj . 

Application to cryptographic puzzles. Cryptographic puzzles are a defense
mechanism against denial of service attacks (DoS). The idea is that, before
spending any resources for the execution of a session between a client and a
server, the server requires the client to solve a puzzle. Since solving puzzles
requires spending cycles, the use of puzzles prevents a malicious client from engaging
in a large number of sessions without itself spending a significant amount of
resources. One desirable condition is that the server does not store any
client-related state.

A simple construction for such puzzles proposed by Juels and Brainard m
is based on an arbitrary one-way function h : {0,1}^l → {0,1}^l. First, select at
random x ← {0,1}^l and compute y = h(x). Then, a puzzle is given by the tuple
(x[1..l−k], y) consisting of the first l − k bits of x together with y. To prove
it solved the puzzle, the client has to return (x, y). It can be easily seen that
the construction above is not entirely satisfactory. In particular, it either fails
against replay attacks (where clients present the same puzzle-solution pair
to the server) or the server needs to store all of the x's used to compute the
puzzles.

The solution proposed to mitigate the above problem is to compute x as 
H(S, t), where S is some large bitstring known only to the server, and t is some 
bitstring that somehow “expires” after a certain amount of time (this can be for 
example the current system time). The puzzle is then given by (t, x[1..l−k], y),
where y = h(x). A solution (or solved puzzle) is (t, x, y), which needs to satisfy
the obvious equations; moreover, t must not be an expired bitstring.
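
The stateless puzzle described above can be sketched in Python as follows. The concrete choices are assumptions made for illustration and are not prescribed by the text: h is instantiated with truncated SHA-256, H(S, t) with truncated HMAC-SHA256 keyed by S, t is an 8-byte big-endian timestamp, and the parameters l = 128 and k = 12 are kept small so that the brute-force search finishes quickly.

```python
import hashlib
import hmac

L_BITS = 128     # l: bit length of x and of the outputs of h
K_BITS = 12      # k: number of hidden bits the client must recover


def h(x: bytes) -> bytes:
    """Stand-in for the one-way function h (assumption: truncated SHA-256)."""
    return hashlib.sha256(b"h" + x).digest()[: L_BITS // 8]


def H(S: bytes, t: bytes) -> bytes:
    """Stand-in for H(S, t) (assumption: truncated HMAC-SHA256 keyed with S)."""
    return hmac.new(S, t, hashlib.sha256).digest()[: L_BITS // 8]


def make_puzzle(S: bytes, t: bytes):
    """Server side: derive x from (S, t) and reveal only its first l - k bits and y = h(x)."""
    x = H(S, t)
    prefix = int.from_bytes(x, "big") >> K_BITS
    return t, prefix, h(x)


def solve(puzzle):
    """Client side: try all 2^k completions of the revealed prefix."""
    t, prefix, y = puzzle
    for low in range(1 << K_BITS):
        candidate = ((prefix << K_BITS) | low).to_bytes(L_BITS // 8, "big")
        if h(candidate) == y:
            return t, candidate, y
    raise ValueError("no solution found")


def verify(S: bytes, solution, now: int, lifetime: int) -> bool:
    """Server side: recompute x from (S, t) without stored state and reject expired t."""
    t, x, y = solution
    issued = int.from_bytes(t, "big")
    if not 0 <= now - issued <= lifetime:
        return False
    return hmac.compare_digest(x, H(S, t)) and hmac.compare_digest(y, h(x))
```

The server keeps only the long-term secret S: a solution is checked by recomputing H(S, t), so no per-client state is needed, and replayed solutions stop verifying once t has expired.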

In the setting above, non-malleability of H surfaces as an important property.
If out of the first two elements (t, H(S, t)) of a puzzle solution the adversary can
efficiently construct (t', H(S, t')) for t' ≠ t, a string which has not yet expired,
then the defense sketched above is rendered useless: the adversary can easily
construct new puzzles (together with their solutions). Requiring that the function
H is non-malleable with respect to the relation R(s_1, s_2) = 1 iff s_1 = (S, t)
and s_2 = (S, t') for t ≠ t' is sufficient to prevent the above attack.

Abstract. The hash function Skein is the submission of Ferguson et
al. to the NIST Hash Competition, and is arguably a serious candi- 
date for selection as SHA-3. This paper presents the first third-party 
analysis of Skein, with an extensive study of its main component: the 
block cipher Threefish. We notably investigate near collisions, distin- 
guishers, impossible differentials, key recovery using related-key differ- 
ential and boomerang attacks. In particular, we present near collisions 
on up to 17 rounds, an impossible differential on 21 rounds, a related-key
boomerang distinguisher on 34 rounds, a known-related-key boomerang 
distinguisher on 35 rounds, and key recovery attacks on up to 32 rounds, 
out of 72 in total for Threefish-512. None of our attacks directly extends 
to the full Skein hash. However, the pseudorandomness of Threefish is re- 
quired to validate the security proofs on Skein, and our results conclude 
that at least 36 rounds of Threefish seem required for optimal security 
guarantees. 

1 Introduction 

The hash function research scene has seen a surge of works since devastating 
attacks PI2101BI on the two most deployed hash functions, MD5 and SHA-1. 
This led to a lack of confidence in the current U.S. (and de facto worldwide) 
hash standard, SHA-2 (jjj, because of its similarity with MD5 and SHA-1. 

As a response to the potential risks of using SHA-2, the U.S. National Institute
of Standards and Technology (NIST) launched a public competition, the NIST
Hash Competition, to select a new hash standard 0. The new hash function,
SHA-3, is expected to have at least the security of SHA-2, and to achieve this
with significantly improved efficiency. By the deadline of October 2008, NIST
received 64 submissions, 51 were accepted as first round candidates, and in July
2009 14 were selected as second round candidates, including Skein. Due to the
critical role of hash functions in security protocols, this competition catches the
attention not only from academia, but also from industry (with candidates from
IBM, Hitachi, Intel, Sony) and from governmental organizations.

** Supported by GEBERT RUF STIFTUNG, project no. GRS-069/07.

Skein [3] is the submission of Ferguson et al. to the NIST Hash Competition.
According to its designers, it combines “speed, security, simplicity and a great
deal of flexibility in a modular package that is easy to analyze” [3, p. i]. Skein
supports three different internal state sizes (256-, 512-, and 1024-bit), and is one 
of the fastest contestants on 64-bit machines. 

Skein is based on the “UBI (The Unique Block Iteration) chaining mode” that 
itself uses a compression function made out of the Threefish-512 block cipher. 
Below we give a brief top-down description of these components: 

• Skein makes three invocations to the UBI mode with different tags: the 
first hashes the configuration block with a tag “Cfg” , the second hashes 
the message with a tag “Msg” , and the third hashes a null value with a 
tag “Out”. 

• UBI mode hashes an arbitrary-length string by iterating invocations to 
a compression function, which takes as input a chaining value, a message 
block, and a tweak. The tweak encodes the number of bytes processed 
so far, and special flags for the first and the last block. 

• The compression function inside the UBI mode is the Threefish-512
block cipher in MMO (Matyas-Meyer-Oseas) mode, i.e., from a chaining
value h, a message block m, and a tweak t it returns E_h(t, m) ⊕ m as
new chaining value.

• Threefish is a family of tweakable block ciphers based on a simple permutation
of two 64-bit words: MIX(x, y) = (x + y, (x + y) ⊕ (y ⋘ R)).
Threefish-512 is the version of Threefish with 512-bit key and 512-bit
blocks, and is used in the default version of Skein.

So far, no third-party cryptanalysis of Skein has been published, and the only
cryptanalytic results are in its documentation [3, §9]. It describes a near collision
on eight rounds for the compression function, a distinguisher for 17 rounds of
Threefish, and it conjectures the existence of key recovery attacks on 24 to 27
rounds (depending on the internal state size). Furthermore, [3, §9] discusses the
possibility of a trivial related-key boomerang attack on a modified Threefish, and 
concludes that it cannot work on the original version. A separate document jH] 
presents proofs of security for Skein when assuming that some of its components 
behave ideally (e.g., that Threefish is an ideal cipher). 

This paper presents the first external analysis of Skein, with a focus on the 
main component of its default version: the block cipher Threefish-512. Table 1
summarizes our results.

Table 1. Summary of the known results on Threefish-512 (near collisions are for
Threefish-512 in MMO mode, related-key boomerang attacks make use of four related
keys, ✓ designates the present paper)

Rounds  Time       Memory  Type                                        Authors
8       1          -       511-bit near-collision                      [3]
16      2^6        -       459-bit near-collision                      ✓
17      2^24       -       434-bit near-collision                      ✓
17      2^8.6      -       related-key distinguisher*                  [3]
21      2^3.4      -       related-key distinguisher                   ✓
21      -          -       related-key impossible differential         ✓
25      ?          -       related-key key recovery (conjectured)      [3]
25      2^416.6    -       related-key key recovery                    ✓
26      2^507.8    -       related-key key recovery                    ✓
32      2^312      2^71    related-key boomerang key recovery          ✓
34      2^398      -       related-key boomerang distinguisher         ✓
35      2^478      -       known-related-key boomerang distinguisher   ✓

*: complexity deduced from the biases in [3, Tab. 22].


The rest of the paper is organized as follows: §2 describes Threefish-512; §3
studies near-collisions for Skein’s compression function with a reduced Threefish-512;
§4 describes impossible differentials; §5 discusses and improves the key-recovery
attacks sketched in [3, §9.3]. Finally, §6 uses the boomerang technique
to describe our best distinguishers and key-recovery attacks on Threefish. §7
concludes.

2 Brief Description of Threefish-512 

Threefish-512 works on 64-bit words, and we write their hexadecimal value in
sans-serif font (e.g., 0123456789ABCDEF). The letter Δ stands for a difference in
the most significant bit (MSB), i.e., Δ = 8000000000000000. Notations are the
same as in the specification of Threefish [3, §2.2]: a 512-bit plaintext block is
parsed as eight words v_{0,0}, ..., v_{0,7}, and is encrypted through N_r = 72 rounds,
where round number d ∈ {0, ..., N_r − 1} operates as follows:

1. If d ≡ 0 mod 4, add a subkey by setting e_{d,i} ← v_{d,i} + k_{d/4,i}, i = 0, ..., 7;
otherwise, just copy the state: e_{d,i} ← v_{d,i}, i = 0, ..., 7.

2. Set (f_{d,2i}, f_{d,2i+1}) ← MIX_{d,i}(e_{d,2i}, e_{d,2i+1}), i = 0, ..., 3, where

$$\mathrm{MIX}_{d,i}(x, y) = \bigl(x + y,\ (x + y) \oplus (y \lll R_{d,i})\bigr),$$

with R_{d,i} a rotation constant dependent on d and i.

3. Permute the state words:

$$\begin{aligned}
v_{d+1,0} &\leftarrow f_{d,2} & v_{d+1,1} &\leftarrow f_{d,1} & v_{d+1,2} &\leftarrow f_{d,4} & v_{d+1,3} &\leftarrow f_{d,7} \\
v_{d+1,4} &\leftarrow f_{d,6} & v_{d+1,5} &\leftarrow f_{d,5} & v_{d+1,6} &\leftarrow f_{d,0} & v_{d+1,7} &\leftarrow f_{d,3}
\end{aligned}$$

After N_r ≡ 0 mod 4 rounds, the ciphertext is set to

$$(v_{N_r,0} + k_{N_r/4,0}),\ \dots,\ (v_{N_r,7} + k_{N_r/4,7}).$$


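The s-th keying (counting from zero, thus occurring at round d = 4s) uses
subkeys k_{s,0}, ..., k_{s,7}. These are derived from the key k_0, ..., k_7 and from the
tweak t_0, t_1 as

$$\begin{aligned}
k_{s,0} &\leftarrow k_{(s+0) \bmod 9} & k_{s,4} &\leftarrow k_{(s+4) \bmod 9} \\
k_{s,1} &\leftarrow k_{(s+1) \bmod 9} & k_{s,5} &\leftarrow k_{(s+5) \bmod 9} + t_{s \bmod 3} \\
k_{s,2} &\leftarrow k_{(s+2) \bmod 9} & k_{s,6} &\leftarrow k_{(s+6) \bmod 9} + t_{(s+1) \bmod 3} \\
k_{s,3} &\leftarrow k_{(s+3) \bmod 9} & k_{s,7} &\leftarrow k_{(s+7) \bmod 9} + s
\end{aligned}$$

where $k_8 = \texttt{5555555555555555} \oplus \bigoplus_{i=0}^{7} k_i$ and $t_2 = t_0 \oplus t_1$.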
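
The round structure and key schedule described above can be sketched in Python as follows. The rotation constants R_{d,i} are not listed in this section, so the sketch takes them as a caller-supplied function rot(d, i); that parameter, like the function names, is an assumption of this illustration, and the code is not a validated Threefish implementation.

```python
from functools import reduce
from typing import Callable, List, Sequence

MASK64 = (1 << 64) - 1
PI = (2, 1, 4, 7, 6, 5, 0, 3)        # word permutation: v_{d+1,j} <- f_{d,PI[j]}
C5 = 0x5555555555555555              # constant used to derive k_8


def rotl(y: int, r: int) -> int:
    """Rotate a 64-bit word to the left by r positions."""
    r %= 64
    return ((y << r) | (y >> (64 - r))) & MASK64


def mix(x: int, y: int, r: int):
    """MIX(x, y) = (x + y, (x + y) xor (y <<< r)) with addition modulo 2^64."""
    s = (x + y) & MASK64
    return s, s ^ rotl(y, r)


def subkey(key: Sequence[int], tweak: Sequence[int], s: int) -> List[int]:
    """Subkey words k_{s,0..7} from the key k_0..k_7 and the tweak t_0, t_1."""
    k = list(key) + [reduce(lambda a, b: a ^ b, key, C5)]    # append k_8
    t = list(tweak) + [tweak[0] ^ tweak[1]]                  # append t_2
    sk = [k[(s + i) % 9] for i in range(8)]
    sk[5] = (sk[5] + t[s % 3]) & MASK64
    sk[6] = (sk[6] + t[(s + 1) % 3]) & MASK64
    sk[7] = (sk[7] + s) & MASK64
    return sk


def encrypt(block: Sequence[int], key: Sequence[int], tweak: Sequence[int],
            rot: Callable[[int, int], int], rounds: int = 72) -> List[int]:
    """Apply `rounds` rounds (a multiple of 4) followed by the final keying."""
    assert rounds % 4 == 0
    v = list(block)
    for d in range(rounds):
        if d % 4 == 0:                                       # keying every fourth round
            e = [(a + b) & MASK64 for a, b in zip(v, subkey(key, tweak, d // 4))]
        else:
            e = v
        f = [0] * 8
        for i in range(4):
            f[2 * i], f[2 * i + 1] = mix(e[2 * i], e[2 * i + 1], rot(d, i))
        v = [f[PI[j]] for j in range(8)]
    return [(a + b) & MASK64 for a, b in zip(v, subkey(key, tweak, rounds // 4))]
```

Because the subkeys depend on the key and tweak only through word additions and the derived words k_8 and t_2, the effect of an MSB difference in chosen key and tweak words can be checked directly by comparing subkey(...) outputs for two related inputs.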

3 Near Collisions for the UBI Compression Function 

We extend the analysis presented in [3, §9] to find near-collisions for the compression
function of Skein’s UBI mode; [3, §9] exploits local collisions, i.e., collisions
in the intermediate values of the state, which occur when particular differences
are set in the key, the plaintext, and the tweak.

The compression function outputs E_k(t, m) ⊕ m, where E is Threefish-512.
Our strategy is simple: like in [3, §9], we prepend a four-round differential trail
to the first local collision at round four so as to avoid differences until the 13-th
round. Then, we follow the trail induced by the introduced difference.

The next three sections work out the details as follows:

• §3.1 shows how to adapt the differential trail found in [3, §9] when a
4-round trail is prepended.

• §3.2 describes the differential trails used and evaluates the probability
that a random input conforms.

• §3.3 explains how to reduce the complexity of the attack by precomputing
a single conforming pair for the first 4-round trail, and using some
conditions to speed up the search.

3.1 Adapting Differences in the Key and the Tweak 

In [3, §9.3.4], Skein’s designers suggest prepending a 4-round trail that leads to
the difference (0, 0, . . . , 0, Δ), previously used for the 8-round collision. However,
the technique as it is presented does not work. This is because the order of
keyings is then shifted, and so the original difference in the key and in the tweak
does not cancel the (0, 0, . . . , 0, Δ) difference at the second keying.

Therefore, for differences to vanish at the third keying, one needs a difference
Δ in k_7 and t_0, which gives a difference (0, . . . , 0, Δ) at the second keying, and
(0, 0, 0, 0, Δ, 0, 0, 0) after the fourth. The difference in the state after (4+8) rounds
is thus the same as originally after eight rounds. Note that, as observed in [3,
§9.4], at least seven keyings separate two vanishing keyings. See Table 2 for
details.

Table 2. Details of the subkeys and of their differences, given a difference Δ in k_7 and
t_0 (leading to Δ differences in k_8 and t_2). For each s, the first line lists the subkey
words and the second line their differences.

s  d   k_{s,0}  k_{s,1}  k_{s,2}  k_{s,3}  k_{s,4}  k_{s,5}  k_{s,6}  k_{s,7}
0  0   k0       k1       k2       k3       k4       k5+t0    k6+t1    k7
       0        0        0        0        0        Δ        0        Δ
1  4   k1       k2       k3       k4       k5       k6+t1    k7+t2    k8+1
       0        0        0        0        0        0        0        Δ
2  8   k2       k3       k4       k5       k6       k7+t2    k8+t0    k0+2
       0        0        0        0        0        0        0        0
3  12  k3       k4       k5       k6       k7       k8+t0    k0+t1    k1+3
       0        0        0        0        Δ        0        0        0
4  16  k4       k5       k6       k7       k8       k0+t1    k1+t2    k2+4
       0        0        0        Δ        Δ        0        Δ        0
5  20  k5       k6       k7       k8       k0       k1+t2    k2+t0    k3+5
       0        0        Δ        Δ        0        Δ        Δ        0
6  24  k6       k7       k8       k0       k1       k2+t0    k3+t1    k4+6
       0        Δ        Δ        0        0        Δ        0        0

3.2 Differential Trails 

We now trace the difference when prepending four rounds, i.e., when the difference
is in k_7 and in t_0 only (and in the plaintext).

4-Round Trail. To prepend four rounds and reach the difference (0, . . . , 0, Δ),
one uses the trail provided in the full version jO] of this paper. The plaintext
difference is modified by the first keying (the MSB differences in the sixth and
eighth word vanish). The probability that a random input successfully crosses
the 4-round differential trail is 2^{-33} (either forward or backward).

12-Round Trail. The second keying adds Δ to the last state word, making its
difference vanish. The state remains free of any difference up to the fourth keying,
after the twelfth round, which sets a difference Δ in the fifth state word. Table 3
presents the corresponding trail for up to the 17-th round. After 17 rounds, the
weight becomes too large to obtain near collisions. On 16 rounds, adding the
final keying and the feedforward, one obtains a collision on 512 − 53 = 459 bits.
Likewise, for 17 rounds, a collision can be found on 512 − 78 = 434 bits.


3.3 Optimizing the Search 

A direct application of the differential trails in the previous section gives a cost
of 2^{33} to cross the first four rounds; then, after the twelfth round:

Table 3. Differential trail (linearization) used for near collisions, of probability 2^{-24}

Rd  Difference                                                               Pr
13  0000000000000000 0000000000000000 8000000000000000 0000000000000000     1
    0000000000000000 8000000000000000 0000000000000000 0000000000000000
14  8000000000000000 0000000000000000 8000000000000000 0000000000000000     1
    0000000000000000 8000010000000000 0000000000000000 8000000000000000
15  8000000000000000 8000000000000000 8000010000000000 8000000000000100     2^{-1}
    8000000000000000 8008010000000400 8000000000000000 8000000000000000
16  0000010000000100 0000000100000000 0008010000000400 0000000400000000     2^{-5}
    0000000000000000 000A014004008400 0000000000000000 0804010000000100
17  8008010400000400 0000010100000140 800A014004008400 A805018020000100     2^{-18}
    8804010000000100 900A016801009402 0000010100000100 8008010420000401


• With 16 rounds: complexity is 2 1+5 = 2 6 * * * * * , so 2 39 in total, for finding a 
collision over 459 bits. 

• With 17 rounds: complexity is 2 1+5+18 = 2 24 , so 2 56 in total, for finding 
a collision over 434 bits. 

A simple trick allows us to avoid the cost of crossing the first 4-round trail: note
that the first keying adds $(k_5 + t_0)$ to the sixth state word, and $(k_6 + t_1)$ to
the seventh; hence, given one conforming pair, one can modify $k_5, k_6, t_0, t_1$ while
preserving the values of $(k_5 + t_0)$ and $(k_6 + t_1)$, and the new input will also follow
the differential trail. It is thus sufficient to precompute a single conforming pair
to avoid the cost due to the prepended rounds.
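The compensation between key and tweak words can be illustrated with a short sketch. It assumes 64-bit words and addition modulo $2^{64}$; the helper name `rekey` is hypothetical. The numerical values are the key words listed in Appendix B, starting from the all-zero key and tweak of Appendix A.

```python
MASK64 = (1 << 64) - 1  # words are 64 bits, additions are modulo 2^64

def rekey(k5, k6, t0, t1, new_k5, new_k6):
    """Choose new key words and adjust the tweak so that k5 + t0 and
    k6 + t1 (the values added by the first keying) stay unchanged."""
    new_t0 = (k5 + t0 - new_k5) & MASK64
    new_t1 = (k6 + t1 - new_k6) & MASK64
    return new_k5, new_k6, new_t0, new_t1

# Start from the zero key/tweak and move to the key words of Appendix B.
k5, k6, t0, t1 = rekey(0, 0, 0, 0, 0xC0DEC0DEC0DEC0DE, 0x6B9B2C1000000000)
print(f"t0 = {t0:016X}, t1 = {t1:016X}")
```

The resulting tweak words coincide with the first input's tweak in Appendix B, so the sums added by the first keying are still zero.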

To carry out this precomputation efficiently, a considerable speedup of the
$2^{33}$ complexity can be obtained by finding sufficient conditions to cross the first
round with probability one (instead of $2^{-21}$):

• A first set of conditions is on the words $(v_{2i}, v_{2i+1})$: whenever there is a
nonzero difference at the same offset, the bit should have a different value
in the first and in the second word (otherwise carries induce additional
differences).

• A second set of conditions concerns the differences that do not “collide”:
one should ensure that no carry propagates from the leftmost bits.

In total, there are 13 + 8 = 21 such conditions, which leaves enough degrees of
freedom to satisfy the subsequent differential trails. Using techniques like neutral
bits m, the cost may be reduced further, but the complexity $2^{12}$ is low
enough for efficiently finding a conforming pair. By choosing inputs according to
the above conditions, while being careful to avoid contradictions, we can find a
pair that conforms within a few thousand trials (see Appendix A for an example).

We can now use this pair to search for near collisions. It suffices to pick random
values for $k_5$ and $k_6$, then set $t_0 = -k_5$ and $t_1 = -k_6$ to get a set of $2^{128}$ distinct
inputs. Experiments were consistent with our analysis, and examples of near
collisions are given in Appendix B.




3.4 Improved Distinguisher 

Based on our trick to cross the first twelve rounds “for free”, we can improve the
distinguisher suggested in [Zj . This distinguisher exploited the observation of a
bias $0.01 < \varepsilon < 0.05$ after 17 rounds (thus leading to a distinguisher requiring at
least $1/0.05^2 \approx 400$ samples). jZj suggested combining it with the prepending of
four rounds, though no further details were given. Our observations show that
with the adapted difference in the key and the tweak, a bias of about 0.3 exists
at the 385-th bit, after 21 rounds. We detected this bias using a frequency test
similar to that in HH §§2.1]. This directly gives a distinguisher on 21 rounds,
requiring only about $1/0.3^2 \approx 11$ samples.
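The sample counts quoted above follow the usual $1/\varepsilon^2$ rule of thumb. A minimal sketch, with a hypothetical helper name:

```python
def samples_needed(bias: float) -> float:
    """Rule-of-thumb number of samples needed to observe a bias: 1 / bias^2."""
    return 1.0 / bias ** 2

# 17-round bias (upper end of the quoted range) versus the 21-round bias.
print(round(samples_needed(0.05)), round(samples_needed(0.3)))
```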

4 Impossible Differentials 

The miss-in-the-middle technique (a term coined by Biham et al. in H2|) was
first applied by Knudsen na to construct a 5-round impossible differential for
the block cipher DEAL. The idea was generalized by Biham et al. H2! to find
impossible differentials for ciphers of any structure. The idea is as follows: Consider
a cascade cipher $E = E^\beta \circ E^\alpha$ such that for $E^\alpha$ there exists a differential
$(\Delta^\alpha_{in} \to \Delta^\alpha_{out})$ and for $(E^\beta)^{-1}$ there exists a differential $(\Delta^\beta_{in} \to \Delta^\beta_{out})$, both
with probability one, where the equality is impossible ($\Delta^\alpha_{out} \neq \Delta^\beta_{out}$). It follows
that the differential $(\Delta^\alpha_{in} \to \Delta^\beta_{in})$ cannot occur, for it requires $\Delta^\alpha_{out} = \Delta^\beta_{out}$. This
technique can be extended to the related-key setting. For example, related-key
impossible differentials were found for 8-round AES-192 [T3HT5L

Below we first present probability-1 truncated differentials on the first 13 
rounds (forward) and on the last seven rounds (backward) of 20-round Threefish- 
512. A “miss-in-the-middle” observation then allows us to deduce the existence 
of impossible differentials on 20 and 21 rounds. 

4.1 Forward Differential 

The first keying ($s = 0$) adds to the state $v_{0,0}, \dots, v_{0,7}$ the values $k_0, k_1, \dots, k_4$,
$k_5 + t_0, k_6 + t_1, k_7$. Then, the second keying ($s = 1$) adds $k_1, \dots, k_5, k_6 + t_1, k_7 +
t_2, k_8 + 1$. By setting a difference $\Delta$ in $k_6, k_7, t_1$ and in the plaintext $v_{0,7}$, we ensure
that differences vanish in the first two keyings, and thus nonzero differences only
appear after the eighth round, with the third keying.
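The cancellation argument can be checked mechanically. The sketch below is illustrative only: it follows the keying pattern described in the text (words 0 to 4 receive key words, word 5 a key word plus a tweak word, word 6 likewise, word 7 a key word plus the counter $s$), together with the Threefish convention that $k_8$ is a constant XORed with all key words and $t_2 = t_0 \oplus t_1$. The function name is hypothetical. Since all differences sit in the MSB, a subkey word carries a difference exactly when an odd number of its terms do.

```python
def active_subkey_words(s, key_diff, tweak_diff):
    """Indices of the words of subkey s that carry an MSB difference.

    key_diff:   set of key word indices (0..7) with an MSB difference
    tweak_diff: set of tweak word indices (0..1) with an MSB difference
    """
    k = [i in key_diff for i in range(8)]
    k.append(sum(k) % 2 == 1)        # k8 = C xor k0 xor ... xor k7
    t = [0 in tweak_diff, 1 in tweak_diff]
    t.append(t[0] != t[1])           # t2 = t0 xor t1
    active = []
    for i in range(8):
        d = k[(s + i) % 9]
        if i == 5:
            d ^= t[s % 3]
        elif i == 6:
            d ^= t[(s + 1) % 3]
        # word 7 also adds the counter s, which carries no difference
        if d:
            active.append(i)
    return active

# Difference in k6, k7 and t1, as in the forward and backward differentials.
for s in range(6):
    print(s, active_subkey_words(s, {6, 7}, {1}))
```

For $s = 0$ only word 7 is active (cancelled by the plaintext difference), for $s = 1$ none, for $s = 2$ only word 4, and for $s = 4$ and $s = 5$ the active words match those used in the backward differential.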

The third keying ($s = 2$) adds $k_2, \dots, k_6, k_7 + t_2, k_8 + t_0, k_0 + 2$. Hence the
difference $\Delta$ is introduced in $e_{8,4}$ only. It gives a difference $\Delta$ in $f_{8,4}, f_{8,5}$, thus in
$v_{9,2}, v_{9,5}$. After the tenth round, the state $v_{10,\cdot}$ has the following difference with
probability one.

8000000000000000 0000000000000000 8000000000000000 0000000000000000 
0000000000000000 8000040000000000 0000000000000000 8000000000000000 . 

After the twelfth round (before the fourth keying), the state $v_{12,\cdot}$ again has some
differences that occur with probability one (the X differences are uncertain, that
is, have probability strictly below one):

XXXXXXXXX4000000 0000000002000000 XXXXXXXXXXXX4000 0000000000000040
0000000000000000 XXXXXXXXXXXXX100 0000000000000000 XXXXXXXXX4000800 . 

Given this class of differences, after the 13-th round (which starts by making 
the fourth keying) we have the class of differences 

XXXXXXXXXXXXXX40 XXXXXXXXX2000000 XXXXXXXXXXXXX100 XXXXXXXXXXXXXX10 
XXXXXXXXXXXXX800 XXXXXXXXXXXXXXXX XXXXXXXXX2000000 XXXXXXXXXXXXXX40 . 

There are in total 92 bits with probability-1 differences between the 13-th and 
the 14-th round. These differences were empirically verified. 


4.2 Backward Differential 


The sixth keying ($s = 5$), which occurs after the 20-th round, returns the
ciphertext

$c_0 = v_{20,0} + k_5$        $c_4 = v_{20,4} + k_0$
$c_1 = v_{20,1} + k_6$        $c_5 = v_{20,5} + k_1 + t_2$
$c_2 = v_{20,2} + k_7$        $c_6 = v_{20,6} + k_2 + t_0$
$c_3 = v_{20,3} + k_8$        $c_7 = v_{20,7} + k_3 + 5$


By setting a difference $\Delta$ in $k_6, k_7, t_1$ (like for the forward differential), and in the
ciphertext words $c_1, c_2, c_5$, we ensure that differences vanish in the sixth keying,
and thus nonzero differences only appear after the 17-th round, when making
the fifth keying (by computing backwards from the 20-th round).

The fifth keying ($s = 4$), after the 16-th round, subtracts from the state the
values $k_4, \dots, k_8, k_0 + t_1, k_1 + t_2, k_2 + 4$. Hence, the difference $\Delta$ is introduced
(backwards) in $v_{16,2}, v_{16,3}, v_{16,5}, v_{16,6}$. After inverting the 16-th round, we obtain
with probability one the difference


XXXXXXXX40000000 0000000040000000 0000000000000000 0000000000000000 
8000000000000000 0000000000000000 XXXXXXXX10000000 0000000010000000 . 


Finally, after inverting the 14-th round, we have the following difference with 
probability one: 

XXXXXXXXXXXX8000 XXXXXXXXXXXX8000 XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX 
XXXXXXXXXXXXXXXX XXXXXXXXXX400000 XXXXXXXXXX800000 XX50000000800000 . 

In total there are 134 bits of difference with probability one between the 14-th 
and the 13-th round. 


4.3 Miss-in-the-Middle 

We showed that if there is a difference $\Delta$ in the key in $k_6$ and $k_7$, and in the
tweak in $t_1$, then a difference $\Delta$ in the plaintext word $v_{0,7}$ propagates to give
probability-1 differences after up to 13 rounds. Then we showed that for the
same difference in the key and in the tweak, a difference $\Delta$ in the ciphertext
words $c_1, c_2, c_5$ guarantees (with probability one) that between the 13-th and the
14-th rounds we also have probability-1 differences.

Looking for example at the first word of the state: the forward differential
leads to a difference in the seventh bit, whereas the backward differential requires
this bit to be unchanged. Therefore, it is impossible that a difference $\Delta$ in the
plaintext $v_{0,7}$ leads to a difference $\Delta$ in $c_1, c_2, c_5$ with 20-round Threefish-512.
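The contradiction can be located with a small script. This is a sketch under a simplifying assumption: the truncated differences are read at nibble granularity exactly as printed in Sections 4.1 and 4.2 (an `X` marks an uncertain nibble), so bit counts may differ slightly from the bit-level figures quoted in the text. Function names are hypothetical.

```python
def parse_pattern(pattern):
    """Return (mask, value); mask marks the nibbles whose difference is known."""
    mask = value = 0
    for ch in pattern:
        mask <<= 4
        value <<= 4
        if ch != "X":
            mask |= 0xF
            value |= int(ch, 16)
    return mask, value

def conflict_bits(forward, backward):
    """Bit positions (0 = least significant) where both patterns are known but disagree."""
    fm, fv = parse_pattern(forward)
    bm, bv = parse_pattern(backward)
    clash = (fv ^ bv) & fm & bm
    return [i for i in range(64) if (clash >> i) & 1]

# State between the 13-th and 14-th rounds, forward (4.1) and backward (4.2).
FORWARD = ["XXXXXXXXXXXXXX40", "XXXXXXXXX2000000", "XXXXXXXXXXXXX100", "XXXXXXXXXXXXXX10",
           "XXXXXXXXXXXXX800", "XXXXXXXXXXXXXXXX", "XXXXXXXXX2000000", "XXXXXXXXXXXXXX40"]
BACKWARD = ["XXXXXXXXXXXX8000", "XXXXXXXXXXXX8000", "XXXXXXXXXXXXXXXX", "XXXXXXXXXXXXXXXX",
            "XXXXXXXXXXXXXXXX", "XXXXXXXXXX400000", "XXXXXXXXXX800000", "XX50000000800000"]

for word, (f, b) in enumerate(zip(FORWARD, BACKWARD)):
    print(word, conflict_bits(f, b))
```

For the first word the only clash is at bit index 6, i.e., the seventh bit mentioned above.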

We can extend this impossible differential one more round: after the 20-th
round and the sixth keying the state has only differences $\Delta$ in $e_{20,1}, e_{20,2}, e_{20,5}$.
These differences always give the same difference after the 21-st round, because
they are only in MSBs. This directly gives an impossible differential on 21
rounds of Threefish-512 (e.g., 21 out of 72). However, contrary to the 20-round
impossible differential, it is irrelevant to Threefish-512 with exactly $N_r = 21$
rounds, because of the final keying that occurs after the 21-st round (which
makes some differences uncertain, because before the keying we have differences
in non-MSBs).

5 Improved Key-Recovery Attacks 

The documentation of Skein sketches key-recovery attacks on all Threefish ver- 
sions, though the complexity is not studied. We analyzed these observations, and 
could find better attacks than conjectured by the Skein designers. 

To optimize the attack strategy in 0 §§9.3], the attacker has to determine
which key bits should be guessed. This is to minimize the noise over the bias
after a partial inversion of the last rounds, and thus to minimize the complexity
of the attack. The fewer key bits are guessed, the better for the attacker (up to the
bound of half the key bits). One can easily determine which key bits do not
affect the bias when inverting one or two rounds. For example, two rounds after
round 21 (where the bias occurs), the 385-th bit does not affect the second,
third, fourth, and sixth state words. Hence, it is not affected by a wrong guess
of the key words $k_0, k_2, k_6$. The bias is slightly affected by erroneous guesses of
$k_3$ (which modifies the last state word in the keying), but it is still large (about
$0.12 \approx 2^{-3}$). It is thus sufficient to guess half the key ($k_1, k_4, k_5, k_7$) to be able
to observe the bias.

Note that the cost of the prepended rounds depends on which key words
are guessed: indeed, when guessing a word, one can adapt the corresponding
plaintext word in order to satisfy the conditions of the differential. Here the
non-guessed words imply a cost $2^{12+18} = 2^{30}$ to cross the first differential. The total
cost of recovering the 512-bit key on 23 rounds is thus about $2^{30} \times 2^{6} \times 2^{256} = 2^{292}$.

To attack more rounds, a more advanced search for the optimal set of bits to
be guessed is likely to reduce the complexity of our attacks. For this, we used the
same strategy as in the analysis of the Salsa20 and ChaCha stream ciphers [E2 ■
Namely, we computed the neutrality of each key bit (i.e., the probability that
flipping the bit preserves the difference), and we chose to guess the bits that affect
the bias the most, using some threshold on their neutrality. More precisely, we
sort key bits according to their neutrality, then filter them with respect to some
threshold value. According to [E]'s terminology, this corresponds to partitioning
the key bits into “significant” and “non-significant” ones.

Recall that in §3.4 we observed a bias at the 385-th bit after 4+17 rounds
of Threefish-512. A key recovery attack on $21 + n$ rounds consists in guessing
some key bits, inverting $n$ rounds based on this guess, letting the other key bits
be random, and observing a bias in that bit. Complexity is determined by the
number of guessed bits and the value of the observed bias.

Inverting four rounds with all key bits whose neutrality is greater than 0.29
(we found 125 of those), we observe a bias 0.0365. Since some key bits are not
guessed, and thus assumed random, some of the conditions to conform to the
first round’s differential cannot be controlled. There are eight such additional
conditions, which means that the 4-round initial differential will be followed
with probability $2^{-12-8}$. Since our bias approximately equals $2^{-4.8}$, and since
we need to guess $512 - 125$ key bits, the overall complexity of the attack on
25-round Threefish-512 is about $2^{12+8} \times 2^{2 \times 4.8} \times 2^{387} = 2^{416.6}$. Below we give
the mask corresponding to the 125 non-guessed bits, for each key word:

0000070060FFF836 0040030021FFFC0E 803C02F03FFFF83F 001001001603C006 

00780E30007F000E 0000000000000000 0000000000000000 007001800E03F801 . 
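As a quick consistency check, the number of set bits in the eight mask words above can be counted directly; the variable names below are only illustrative.

```python
# Mask of non-guessed key bits, one 64-bit word per key word (as printed above).
MASKS = [
    0x0000070060FFF836, 0x0040030021FFFC0E, 0x803C02F03FFFF83F, 0x001001001603C006,
    0x00780E30007F000E, 0x0000000000000000, 0x0000000000000000, 0x007001800E03F801,
]

non_guessed = sum(bin(m).count("1") for m in MASKS)
guessed = 512 - non_guessed
print(non_guessed, guessed)
```

The count agrees with the 125 non-guessed bits and the $512 - 125 = 387$ guessed bits used in the complexity estimate.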

We can apply the same method on 26 rounds: with a neutrality threshold 0.17
we obtain 30 “significant” key bits, and we observe a bias about 0.017 when
all of them are random. The non-guessed bits give two additional conditions for
the first 4-round differential. In total, the complexity of the attack is thus about
$2^{12+2} \times 2^{2 \times 5.9} \times 2^{482} = 2^{507.8}$. Memory requirements are negligible.
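The three complexity figures of this section share the same structure: the cost of crossing the prepended differential, times the number of samples needed to see the bias ($1/\varepsilon^2$), times the number of key guesses. A sketch working in base-2 logarithms, with a hypothetical helper name:

```python
def attack_log2_cost(trail_bits, bias_bits, guessed_bits):
    """log2 of: 2^trail_bits * (2^bias_bits)^2 * 2^guessed_bits,
    where the bias is 2^(-bias_bits)."""
    return trail_bits + 2 * bias_bits + guessed_bits

print(attack_log2_cost(12 + 18, 3.0, 256))  # 23 rounds
print(attack_log2_cost(12 + 8, 4.8, 387))   # 25 rounds
print(attack_log2_cost(12 + 2, 5.9, 482))   # 26 rounds
```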

6 Boomerang Attacks 

Boomerang attacks were introduced by Wagner and first applied to block ciphers m.
Roughly speaking, in boomerang attacks one uses two short differential
trails rather than a long one to exploit the efficiency of the former trails.
Let $E$ denote the encryption function of Threefish. View $E$ as a cascade of four
subciphers

$E = E^\omega \circ E^\gamma \circ E^\beta \circ E^\alpha$, (1)

so that $E$ is composed of a core $E' = E^\gamma \circ E^\beta$ sandwiched by rounds $E^\alpha$ and
$E^\omega$. The boomerang distinguisher is generally described for $E'$ only, but for key
recovery attacks on Threefish we need to generalize the attack to the construction
in Eq. (1).

Recall that in related-key attacks, one assumes that the attacker can query
the cipher with other keys that have some specified relation with the original
key. This relation is often an XOR-difference. A related-key differential is thus
a triplet $(\Delta_{in}, \Delta_{out}, \Delta_k)$, associated with the probability

$\Pr_{k,m} [E_k(m) \oplus E_{k \oplus \Delta_k}(m \oplus \Delta_{in}) = \Delta_{out}] = p$.

Here, $\Delta_{in}$ and $\Delta_{out}$ are the input and output differences, $\Delta_k$ is the key difference,
and $p$ the probability of the differential.
For (related-key) boomerang attacks based on four related keys, one exploits
two short related-key differentials: $(\Delta^\beta_{in}, \Delta^\beta_{out}, \Delta^\beta_k)$ for $E^\beta$, of probability $p$, and
$(\Delta^\gamma_{in}, \Delta^\gamma_{out}, \Delta^\gamma_k)$ for $E^\gamma$, of probability $q$.

A distinguisher then works as follows:

1. Pick a random plaintext $m_1$ and form $m_2 = m_1 \oplus \Delta^\beta_{in}$.

2. Obtain $c_1 = E'_{k^1}(m_1)$ and $c_2 = E'_{k^2}(m_2)$.

3. Set $c_3 = c_1 \oplus \Delta^\gamma_{out}$ and $c_4 = c_2 \oplus \Delta^\gamma_{out}$.

4. Obtain $m_3 = E'^{-1}_{k^3}(c_3)$ and $m_4 = E'^{-1}_{k^4}(c_4)$.

5. Check $m_3 \oplus m_4 = \Delta^\beta_{in}$.

For an ideal cipher, the final equality is expected to hold with probability $2^{-n}$
where $n$ is the block length. The probability of the related-key boomerang distinguisher,
on the other hand, is approximately $p^2 q^2$ (see fl 7111 8111 1)11201 for details).
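The five steps can be sketched as a single trial function. This is only an illustration: the key relations $k^2 = k^1 \oplus \Delta^\beta_k$, $k^3 = k^1 \oplus \Delta^\gamma_k$, $k^4 = k^2 \oplus \Delta^\gamma_k$ are the usual related-key boomerang convention and are assumed here, and the toy cipher $E_k(m) = m \oplus k$ is a stand-in (not Threefish) for which the boomerang returns with probability one.

```python
import random

def boomerang_returns(enc, dec, k1, dk_beta, dk_gamma, d_in, d_out, m1):
    """Run one related-key boomerang trial and report whether it returns."""
    k2 = k1 ^ dk_beta            # assumed key relations (see lead-in)
    k3 = k1 ^ dk_gamma
    k4 = k2 ^ dk_gamma
    m2 = m1 ^ d_in                               # step 1
    c1, c2 = enc(k1, m1), enc(k2, m2)            # step 2
    c3, c4 = c1 ^ d_out, c2 ^ d_out              # step 3
    m3, m4 = dec(k3, c3), dec(k4, c4)            # step 4
    return (m3 ^ m4) == d_in                     # step 5

# Toy stand-in cipher on 64-bit words, used only to exercise the procedure.
def toy_enc(k, m):
    return m ^ k

def toy_dec(k, c):
    return c ^ k

rng = random.Random(1)
trials = [
    boomerang_returns(toy_enc, toy_dec, rng.getrandbits(64), 1 << 63, 1 << 62,
                      0x8000000000000000, 0x0000010000000100, rng.getrandbits(64))
    for _ in range(100)
]
print(all(trials))
```

With a real cipher, `enc` and `dec` would be replaced by the reduced-round core and the fraction of returning trials would be compared against $p^2 q^2$.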

Note that the boomerang attack can be generalized to exploit multiple differentials.
The success probability then becomes $\hat{p}^2 \hat{q}^2$, where $\hat{p}$ and $\hat{q}$ are the square
roots of the sums of the squares of the probabilities of the differentials exploited.¹


6.1 Exploiting Nonlinear Differentials 

Differentials are often found via linearization, i.e., assuming that integer additions
behave as XORs. One then evaluates the probability of the differential
with respect to the probability that each active addition behaves as XOR. This
probability equals $2^{-w}$, where $w$ is the Hamming weight of the logical OR of the
two difference masks, excluding the MSB.
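The weight rule can be applied to the linear trail of Table 3. The sketch below assumes that the additions of a round act on the word pairs $(v_{2i}, v_{2i+1})$ and uses the Table 3 rows for rounds 14 to 16 as read after repairing extraction damage; the function name is hypothetical. The weight computed on one row is the cost of reaching the next row.

```python
MSB = 1 << 63

def linearization_weight(state):
    """Sum over the word pairs (v0,v1), (v2,v3), ... of the Hamming weight
    of the OR of the two difference masks, MSB excluded."""
    weight = 0
    for a, b in zip(state[0::2], state[1::2]):
        weight += bin((a | b) & (MSB - 1)).count("1")
    return weight

ROW14 = [0x8000000000000000, 0, 0x8000000000000000, 0,
         0, 0x8000010000000000, 0, 0x8000000000000000]
ROW15 = [0x8000000000000000, 0x8000000000000000, 0x8000010000000000, 0x8000000000000100,
         0x8000000000000000, 0x8008010000000400, 0x8000000000000000, 0x8000000000000000]
ROW16 = [0x0000010000000100, 0x0000000100000000, 0x0008010000000400, 0x0000000400000000,
         0, 0x000A014004008400, 0, 0x0804010000000100]

print([linearization_weight(r) for r in (ROW14, ROW15, ROW16)])
```

The three weights add up to 24, in line with the probability $2^{-24}$ stated for the trail.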

Yet one is not limited to such “linear” differentials, and the best differential — 
in terms of probability — is not necessarily a linearization, as illustrated by the 
work of Lipmaa and Moriai EH: for integer addition, they presented efficient 
algorithms for computing the probability of any differential, and for finding the 
optimal differential. The problem was later studied using formal rational series 
with linear representation EH- 

We used the algorithms in EH to find the differentials of our boomerang 
attacks. Note that it is not guaranteed that our trails are optimal, for the com- 
bination of local optimal differential trails (with respect to their probability) 
may contribute to a faster increase of the weight than (non-necessarily optimal) 
linear differentials. Yet our best differentials are not completely linear. 

6.2 Related-Key Distinguishers 

Like in our previous attacks, we exploit differences in the key and in the plaintext
that vanish until the twelfth round (both for the forward and backward
differentials). Then, we follow a nonlinear differential trail until the middle of
the cipher, i.e., between the 16-th and 17-th rounds. Our differential trail for $E^\beta$
has probability $p = 2^{-86}$, and the one for $E^\gamma$ has probability $q = 2^{-113}$, leading to a
boomerang distinguisher on 34 rounds requiring about $(pq)^{-2} = 2^{398}$ trials (see
full version El). Note that for the second part, MSB differences are set in the
key words $k_2$ and $k_3$, and in the tweak words $t_0$ and $t_1$ (thus giving no difference
in the seventh subkey).

1 Throughout the paper, our differentials do not make use of this multiple differential
approach. One can further improve upon the differentials provided in this work by
using this technique.

6.3 Known-Related-Key Distinguishers 

Although the standard notion of distinguisher requires a secret (key), the notion 
of known-key distinguisher m is also relevant to set apart a block cipher from 
a randomly chosen permutation. Moreover, when a block cipher is used within a 
compression function, as Threefish is, known-key distinguishers may lead to dis- 
tinguishers for the hash function because all inputs are known to the adversary. 
If differences in the keys are used, we shall thus talk of known-related-key distin- 
guisher. An example of such distinguisher is the exhibition of input/output pairs 
that have some specific relation, as presented in m for seven rounds of AES- 
128. Here, we shall consider tuples $(m_1, m_2, m_3, m_4, c_1, c_2, c_3, c_4)$ that satisfy the
boomerang property. 

To build a known-related-key boomerang distinguisher on Threefish, we con- 
sider the decryption function, i.e., we start from the end of the cipher: when the 
key is known, the attacker can easily find a ciphertext that conforms to the first 
differential (e.g., to the weight-83 differential at round 35), which we could ver- 
ify experimentally. In other words, the final differential (including the differences 
caused by the final key) is “free” when launching the boomerang. When it returns, 
however, the $2^{83}$ factor cannot be avoided if we want to exactly follow the differential
(which is not strictly necessary to run a distinguisher). We thus obtain a
distinguisher on 35-round Threefish-512 with complexity $2^{83}$ times that of the
related-key distinguisher on 34 rounds, that is, approximately $2^{478}$ encryptions.

Several tricks may be used to obtain a similar distinguisher at a reduced cost. 
For example, observing that the first and fourth (resp. second and third) MIX 
functions of round 34 depend only on the first and second (resp. third and fourth) 
MIX’s of round 35, one can speed-up the search for inputs conforming to the 
first two rounds of the boomerang. 


6.4 Extension to Key-Recovery 

We now show how to build a key-recovery attack on top of a boomerang distinguisher
for 32-round Threefish-512. We present some preliminary observations
before describing and analyzing our attack.

Using the notation of Eq. (1): $E^\beta$ starts from the beginning and ends after the
key addition in round 16, and $E^\gamma$ starts from round 17 and ends just before the
key addition after round 32. Our goal is to recover the last subkey. Restricted to
32 rounds, the boomerang distinguisher has probabilities $p = 2^{-86}$ for $E^\beta$ and
$q = 2^{-37}$ for $E^\gamma$, yielding an overall boomerang probability of $p^2 q^2 = 2^{-246}$. We
now introduce some notions required to facilitate the analysis of our attack.
Definition 1 (CS-sequence). Let $\delta$ be a 64-bit word of Hamming weight $0 <
w < 64$. The CS-sequence of $\delta$ is

$S_\delta = (|s_0|, |s_1|, \dots, |s_{w-1}|)$,

where $|s_i|$ is the bit length of the $i$-th block of consecutive zeros in $\delta$ finishing
with a one.

For example, for $\delta$ = 1000010402000000 we have

$\delta$ = 0001, 0000 0000 0000 0000 0001, 0000 01, 00 0000 001, 0 0000 ... 0000,

and so the CS-sequence of $\delta$ is $S_\delta = (|s_0|, |s_1|, |s_2|, |s_3|) = (4, 20, 6, 9)$.

The following result, whose proof is provided in the full version of this
paper 0, is extensively used in the key-recovery attack based on the boomerang
distinguisher.

Theorem 1. The number of possible differences $N_\delta$ after addition of difference
$\delta$ with zero or $\Delta$ = 8000000000000000 difference modulo $2^{64}$ can be directly
computed from the CS-sequence of $\delta$ as

$N_\delta = |s_0| \sum_{(k_1, \dots, k_{w-1}) \in \{0,1\}^{w-1}} \prod_{i=1}^{w-1} |s_i|^{k_i}$.

For instance, if $\delta$ = 1000010402000000 then

$N_\delta = 4 \sum_{(k_1, k_2, k_3) \in \{0,1\}^3} (20^{k_1} \times 6^{k_2} \times 9^{k_3})$

$= 4 \times (1 + 9 + 6 + (6 \times 9) + 20 + (20 \times 6) + (20 \times 9) + (20 \times 9 \times 6))$

$= 4 \times 1470 = 5880$.
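Definition 1 and the count of Theorem 1 translate into a few lines of code. The sketch below uses hypothetical function names, scans the word from the most significant bit, and evaluates the sum through the equivalent factorized form $|s_0| \prod_{i \geq 1} (1 + |s_i|)$, which follows by expanding the product over the $k_i \in \{0, 1\}$.

```python
def cs_sequence(delta, bits=64):
    """Lengths of the blocks of consecutive zeros finishing with a one,
    reading delta from its most significant bit."""
    seq, run = [], 0
    for i in range(bits - 1, -1, -1):
        run += 1
        if (delta >> i) & 1:
            seq.append(run)
            run = 0
    return seq

def num_differences(seq):
    """|s_0| times the product of (1 + |s_i|) for i >= 1."""
    n = seq[0]
    for s in seq[1:]:
        n *= 1 + s
    return n

seq = cs_sequence(0x1000010402000000)
print(seq, num_differences(seq))
```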

Applying Theorem 1, the number of possible output differences caused
by $\Delta^\gamma_{out}$ just after the key addition that follows the related-key boomerang
distinguisher for Threefish-512 is approximately $2^{62}$. We obtain this number by
multiplying the number of possibilities for each word of the state (see Table 4).

Table 4. Number of possible output differences after the key addition in Threefish-512,
for each word. Multiplying these numbers, we obtain in total approximately $2^{62}$
possible differences.

$v_{32,i}$ | $S_\delta$ | $N_\delta$
$v_{32,0}$ | (24, 15) | 384
$v_{32,1}$ | (32) | 32
$v_{32,2}$ | (0) | 1
$v_{32,3}$ | (4, 20, 6, 9) | 5880
$v_{32,4}$ | (1) | 1
$v_{32,5}$ | (13, 2, 9, 2, 12, 11, 5) | 957840
$v_{32,6}$ | (13, 11, 30) | 4836
$v_{32,7}$ | (14) | 14




The Attack. Our attack works in three steps: in the first step, we obtain 
quartets satisfying the related-key boomerang relation; in the second, we recover 
the partial key by using the possible right quartets obtained from the first step; 
the last step is the brute force search of the rest of the key. The attack works as 
follows. 

1. Find right quartets

for $i = 1, \dots, 2^{248}$

• Generate a random unique pair of chosen plaintexts $(m_1^i, m_2^i)$ with a
$\Delta^\beta_{in}$ difference and encrypt each plaintext with keys $k^1$ and $k^2$ (having a
$\Delta^\beta_k$ difference) respectively to obtain the corresponding ciphertexts
$(c_1^i, c_2^i)$.

• for $j = 1, \dots, 2^{62}$

  o Set $c_3^j = c_1^i \oplus \Delta'^\gamma_{out}$ where $\Delta'^\gamma_{out}$ is set to the $j$-th possible
    difference caused by $\Delta^\gamma_{out}$.
  o Decrypt $c_3^j$ with $k^3$ and obtain the plaintext $m_3^j$.
  o Store the values $c_3^j$ and $m_3^j$.

• for $k = 1, \dots, 2^{62}$

  o Set $c_4^k = c_2^i \oplus \Delta'^\gamma_{out}$ where $\Delta'^\gamma_{out}$ is set to the $k$-th possible
    difference caused by $\Delta^\gamma_{out}$.
  o Decrypt $c_4^k$ with $k^4$ and obtain the plaintext $m_4^k$.
  o Calculate $M = m_4^k \oplus \Delta^\beta_{in}$ and check whether $M$ exists among
    the stored values of $m_3^j$. If this is the case, store the possible
    right quartet.

• Free the memory allocated for the stored values of (possibly wrong)
$c_3^j$ and $m_3^j$. Increment $i$.

2. Recover the partial key

For each ciphertext word having a nonzero difference of a (possibly) right
quartet $(c_1, c_2, c_3, c_4)$, guess the corresponding output whitening key word
$k^1_{w,l}$ for $l = 0, 3, 5, 6$, and check

$(c_{1,l} - k^1_{w,l}) \oplus (c_{3,l} - k^3_{w,l}) = (c_{2,l} - k^2_{w,l}) \oplus (c_{4,l} - k^4_{w,l}) = \Delta^\gamma_{out,l}$,

where $k^2_{w,l} = k^1_{w,l} \oplus \Delta^\beta_{k,l}$ and $k^3_{w,l} = k^4_{w,l} \oplus \Delta^\beta_{k,l}$. If this is the case, store
this $k^1_{w,l}$.

3. Recover the full key

Run an exhaustive search of the remaining bits of the subkey.
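The matching performed in the two inner loops of step 1 (store the $m_3$ candidates, then look up $m_4 \oplus \Delta^\beta_{in}$) is a standard table lookup. A minimal sketch with small integers standing in for 512-bit blocks; the function name and the sample values are purely illustrative.

```python
def match_quartets(m3_list, m4_list, d_in):
    """Return all index pairs (j, k) with m3_list[j] ^ m4_list[k] == d_in."""
    index = {}
    for j, m3 in enumerate(m3_list):          # first loop: store candidates
        index.setdefault(m3, []).append(j)
    hits = []
    for k, m4 in enumerate(m4_list):          # second loop: one lookup per candidate
        for j in index.get(m4 ^ d_in, []):
            hits.append((j, k))
    return hits

print(match_quartets([5, 9, 12], [6, 7], 3))
```

Each lookup costs one memory access, so the work per outer iteration stays linear in the number of candidate differences rather than quadratic.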

Complexity Analysis. The goal of step 1 is to find enough quartets satisfying
the related-key boomerang trail. For each of the $2^{248}$ distinct plaintext-ciphertext pairs
$(m_1, m_2)$ and $(c_1, c_2)$, we correspondingly generate $2^{62}$ new plaintext-ciphertext
pairs $(m_3, c_3)$ and $(m_4, c_4)$ by using the possible output differences
given in Table 4. We know that a right quartet has to satisfy one of the possible
output differences $\Delta'^\gamma_{out}$; hence we are guaranteed to find the right
quartet if it exists, as we consider all possible combinations. Note that
increasing the number of quartets in that manner does not increase the number of
right quartets, simply because the newly generated plaintext-ciphertext
pairs $(m_3, c_3)$ and $(m_4, c_4)$ can only have one root right plaintext-ciphertext
pair $(m_1, m_2)$ and $(c_1, c_2)$. Therefore, the expected number of right quartets is
$2^{248} \cdot 2^{-246} = 2^{2}$. On the other hand, we expect $2^{372} \cdot 2^{-512} = 2^{-140}$ additional
false quartets.

The first loop at step 1 requires $2^{62}$ reduced-round Threefish decryptions
and approximately $2^{70.5}$ bytes of memory. The second loop can be implemented
independently and requires $2^{62}$ reduced-round Threefish decryptions and $2^{62}$
memory accesses. On the other hand, we need additional memory
of $2^{69.5}$ bytes for storing the $\Delta'^\gamma_{out}$ values. Therefore, the overall complexity of the
first step is bounded by $2^{312}$ reduced-round Threefish decryptions and about $2^{71}$
bytes of memory. Note that the memory requirement for the surviving quartets
is negligible.

Step 2 tries to recover the last subkey by using the quartets that passed the
previous step. For each surviving quartet, we guess 64 bits of the final key at
each word, decrypt one round and check the output difference $\Delta^\gamma_{out,l}$. As the
computation at each word can be processed independently, the overall complexity
of this step is dominated by the previous step.

The probability that a false combination of quartets and key bits is counted in
step 2 is upper bounded by $2^{-2w_l}$, where $w_l$ is the minimum Hamming weight of
the corresponding output difference $\Delta^\gamma_{out,l}$. Therefore, the right key is suggested
$4 + 2^{-140} \cdot 2^{-2w_l} \approx 4$ times by the right and additional false quartets. On the
other hand, a wrong key is expected to be hit $4 \cdot 2^{-2w_l} + 2^{-140} \cdot 2^{-2w_l} \approx 2^{-2}$ times.
Note that this only holds for the words having an XOR difference of Hamming
weight two; for the rest the number of hits is strictly less than $2^{-2}$. We can use
the Poisson distribution to calculate the success rate of our attack. For an expected
number of $2^{-2}$, the probability that a wrong key is suggested at most once is
0.97. However, the probability that the right key is suggested more than once is
more than 0.90. Therefore, we can find the right key or at least eliminate most
of the keys with high probability. The complexity of the rest of the attack is
dominated by the first step.
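The two Poisson figures can be reproduced as follows; this is a plain evaluation of the Poisson cumulative distribution with a hypothetical helper name.

```python
import math

def poisson_cdf(k, lam):
    """P[X <= k] for X Poisson-distributed with mean lam."""
    return math.exp(-lam) * sum(lam ** i / math.factorial(i) for i in range(k + 1))

wrong_key_at_most_once = poisson_cdf(1, 2 ** -2)    # expected 2^-2 hits
right_key_more_than_once = 1 - poisson_cdf(1, 4)    # expected 4 hits
print(round(wrong_key_at_most_once, 2), round(right_key_more_than_once, 2))
```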


7 Conclusion 

We applied a wide range of attack strategies to the core algorithm of Skein 
(the block cipher Threefish-512), culminating with a distinguisher on 35-round 
Threefish-512, and a key-recovery attack on 32 rounds. Other versions of Three- 
fish are vulnerable to similar attack strategies (for example, our related-key 
boomerang distinguisher works on up to 33 rounds of Threefish-256). To the 
best of our knowledge, this is the first application of a key-recovery boomerang
attack to an “ARX” algorithm, and also the first application of the boomerang 
technique to known-key distinguishers. 

Despite its relative simplicity, the full Threefish seems to resist state-of-the- 
art cryptanalytic techniques. Its balanced “ARX” structure combined with large 
words provides a good balance between diffusion and non-linearity, and avoids 
any particular structure exploitable by attackers. Using attacks on Threefish 
to attack the hash function Skein (or its compression function) seems difficult, 
because of the rather complex mode of operation of Skein. Although none of our 
attacks directly extends to the hash mode, the pseudorandomness of Threefish 
is required to validate the security proofs on Skein. Hence, 36 or more rounds of 
Threefish seem to be required to provide optimal security. 

Future work might apply the recent rebound attack m to Threefish, although
it looks difficult to combine it with the trick discussed in §3.1: this
forces the attacker to use specific differences. Another research direction relates
to optimization of boomerang known- or chosen-key distinguishers.
A Conforming Pair for the 4-Round Differential

When the key and the tweak are zero, the following two message blocks conform
to the differential described in

E979D16280002004 32B29AE900000000 D921590E00000000 5771CC9000000400
A62FF22800000000 484B245000040080 D3BEA4E800008010 7A72784300000000

A971917200100020 72B2DAE980002004 DD61588E01000400 5331CC1000000000
A62FF22800040090 C84B245000000000 D1BEA4E800000000 FA72784300008010
B Examples of Near Collisions

We provide an example of near collision on 459 bits for the reduced compression
function of Skein’s UBI mode. Both inputs always have $k_0 = \dots = k_4 = k_7 = 0$,
and

$k_5$ = C0DEC0DEC0DEC0DE.

On the 16-round compression function, the first input has message block

E979D16280002004 32B29AE900000000 D921590E00000000 5771CC9000000400
A62FF22800000000 484B245000040080 D3BEA4E800008010 7A72784300000000

and

$k_6$ = 6B9B2C1000000000   $t_0$ = 3F213F213F213F22   $t_1$ = 9464D3F000000000

The second input has message block

A971917200100020 72B2DAE980002004 DD61588E01000400 5331CC1000000000
A62FF22800040090 C84B245000000000 D1BEA4E800000000 FA72784300008010

and

$k_6$ = 6B9B2C1000000000   $t_0$ = BF213F213F213F22   $t_1$ = 9464D3F000000000

The corresponding digests are respectively

2A6DE91E3E8CDE3B BADAF451F59D3145 7C298A43FB73463F D8309C9E9E2594D5
35431D226A2022E3 0EA42EB45F9EEEB9 DF038EECD6504300 588A798B1266D67A

and

6A65A80EBE9CFF1F FADAB450759D1141 78618AC3FA73463F 5C709C1A9E2590D5
B5431D226A242273 8EAE2FF45B9A6A39 5D038EECD650C310 D08E788B1266576A
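The near-collision claim can be checked by counting the bits in which the two digests differ. The sketch below uses the digest words as printed above, with the stray spaces introduced by extraction removed; variable names are illustrative.

```python
DIGEST_1 = [
    0x2A6DE91E3E8CDE3B, 0xBADAF451F59D3145, 0x7C298A43FB73463F, 0xD8309C9E9E2594D5,
    0x35431D226A2022E3, 0x0EA42EB45F9EEEB9, 0xDF038EECD6504300, 0x588A798B1266D67A,
]
DIGEST_2 = [
    0x6A65A80EBE9CFF1F, 0xFADAB450759D1141, 0x78618AC3FA73463F, 0x5C709C1A9E2590D5,
    0xB5431D226A242273, 0x8EAE2FF45B9A6A39, 0x5D038EECD650C310, 0xD08E788B1266576A,
]

# Hamming distance between the two 512-bit digests.
differing = sum(bin(a ^ b).count("1") for a, b in zip(DIGEST_1, DIGEST_2))
print(differing, 512 - differing)
```

The digests differ in 53 bits, i.e., they agree on $512 - 53 = 459$ bits, as stated for the 16-round case.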
Abstract. In this paper, an improved differential cryptanalysis framework for
finding collisions in hash functions is provided. Its principle is based on lineariza- 
tion of compression functions in order to find low weight differential characteris- 
tics as initiated by Chabaud and Joux. This is formalized and refined however in 
several ways: for the problem of finding a conforming message pair whose differ- 
ential trail follows a linear trail, a condition function is introduced so that finding 
a collision is equivalent to finding a preimage of the zero vector under the con- 
dition function. Then, the dependency table concept shows how much influence 
every input bit of the condition function has on each output bit. Careful analysis 
of the dependency table reveals degrees of freedom that can be exploited in ac- 
celerated preimage reconstruction under the condition function. These concepts 
are applied to an in-depth collision analysis of reduced-round versions of the two 
SHA-3 candidates CubeHash and MD6, and are demonstrated to give by far the 
best currently known collision attacks on these SHA-3 candidates. 

Keywords: Hash functions, collisions, differential attack, SHA-3, CubeHash and 
MD6. 


1 Introduction 

Hash functions are important cryptographic primitives that find applications in many 
areas including digital signatures and commitment schemes. A hash function is a trans- 
formation which maps a variable-length input to a fixed-size output, called message 
digest. One expects a hash function to possess several security properties, one of which 
is collision resistance. Being collision resistant informally means that it is hard to find
two distinct inputs which map to the same output value. In practice, hash functions
are mostly built from a fixed input size compression function, e.g. using the renowned
Merkle-Damgård construction. To any hash function, no matter how it has been designed, we
can always attribute fixed input size compression functions, such that a collision for 
a derived compression function results in a direct collision for the hash function itself. 
This way, firstly we are working with fixed input size compression functions rather than 
varying input size ones, secondly we can attribute compression functions to those hash 
functions which are not explicitly based on a fixed input size compression function, and 

* An extended version is available at http : //eprint . iacr . org/2009/382 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 560 (577] 2009. 
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thirdly we can derive different compression functions from a hash function. For exam- 
ple multi-block collision attack O benefits from the third point. Our task is to find two 
messages for an attributed compression function such that their digests are preferably 
equal (a collision) or differ in only a few bits (a near-collision). 

The goal of this work is to revisit collision-finding methods using linearization of the
compression function in order to find differential characteristics for the compression
function. This method was initiated by Chabaud and Joux on SHA-0 and was later extended and
applied to SHA-1 by Rijmen and Oswald. The recent attack on EnRUPT by Indesteege and Preneel
is another application of the method. In particular, it was observed that the codewords of
a linear code, which are defined through a linearized version of the compression function,
can be used to identify differential paths leading to a collision for the compression
function itself. This method was later extended by Pramstaller et al. with the general
conclusion that finding high probability differential paths is related to low weight
codewords of the attributed linear code. In this paper we further investigate this issue.

The first contribution of our work is to present a more concrete and tangible relation
between the linearization and differential paths. In the case that modular addition is the
only involved nonlinear operation, our results can be stated as follows. Given the parity
check matrix $\mathcal{H}$ of a linear code, and two matrices $\mathcal{A}$ and
$\mathcal{B}$, find a codeword $\Delta$ such that
$\mathcal{A}\Delta \vee \mathcal{B}\Delta$ is of low weight. This is clearly different from
the problem of finding a low weight codeword $\Delta$. We then consider the problem of
finding a conforming message pair for a given differential trail for a certain linear
approximation of the compression function. We show that the problem of finding conforming
pairs can be reformulated as finding preimages of zero under a function which we call the
condition function. We then define the concept of a dependency table, which shows how much
influence every input bit of the condition function has on each output bit. By carefully
analyzing the dependency table, we are able to profit not only from neutral bits but also
from probabilistic neutral bits in a backtracking search algorithm, similar to those used
in earlier work. This contributes to a better understanding of the use of freedom degrees.

We consider compression functions working with $n$-bit words. In particular, we focus on
those using modular addition of $n$-bit words as the only nonlinear operation. In practice,
the incorporated linear operations are XOR, shift and rotation of $n$-bit words. We present
our framework in detail for these constructions by approximating modular addition with XOR.
We demonstrate its validity by applying it to reduced-round variants of CubeHash (one of
the NIST SHA-3 competitors), which uses addition, XOR and rotation. CubeHash instances are
parametrized by two parameters $r$ and $b$ and are denoted by CubeHash-$r$/$b$; they
process $b$ message bytes per iteration, and each iteration is composed of $r$ rounds.
Although we cannot break the original submission CubeHash-8/1, we provide real collisions
for the much weaker variants CubeHash-3/64 and CubeHash-4/48. Interestingly, we show that
the more secure variants CubeHash-6/16 and CubeHash-7/64 do not provide the desired
collision security for 512-bit digests, by giving theoretical attacks with complexities
$2^{222.6}$ and $2^{203.0}$ respectively. We also show that CubeHash-6/4 with 512-bit
digests is not second-preimage resistant, as a second preimage can be produced with
probability $2^{-478}$ by only one hash evaluation. Our theory can be easily generalized to
arbitrary nonlinear operations. We discuss this issue and, as an application, we provide
collision attacks on 16 rounds of MD6. MD6 is another SHA-3 candidate whose original number
of rounds varies from 80 to 168 when the digest size ranges from 160 to 512 bits.

2 Linear Differential Cryptanalysis 

Let’s consider a compression function $H = \mathrm{Compress}(M, V)$ which works with
$n$-bit words and maps an $m$-bit message $M$ and a $v$-bit initial value $V$ into an
$h$-bit output $H$. Our aim is to find a collision for such compression functions with a
randomly given initial value $V$. In this section we consider modular-addition-based
Compress functions, that is, functions that use only modular additions in addition to
linear transformations. This includes the family of AXR (Addition-XOR-Rotation) hash
functions, which are based on these three operations. Later in the paper we generalize our
framework to other families of compression functions. For these Compress functions, we are
looking for two messages with a difference $\Delta$ that result in a collision. In
particular we are interested in a $\Delta$ for which two randomly chosen messages with this
difference lead to a collision with a high probability for a randomly chosen initial value.
For modular-addition-based Compress functions, we consider a linearized version in which
all additions are replaced by XOR. This is a common linear approximation of addition. Other
possible linear approximations of modular addition, which are less addressed in the
literature, can be considered according to that generalization. As addition was the only
nonlinear operation, we now have a linear function which we call
$\mathrm{Compress}_{\mathrm{lin}}$. Since
$\mathrm{Compress}_{\mathrm{lin}}(M, V) \oplus \mathrm{Compress}_{\mathrm{lin}}(M \oplus \Delta, V) = \mathrm{Compress}_{\mathrm{lin}}(\Delta, 0)$
is independent of the value of $V$, we adopt the notation
$\mathrm{Compress}_{\mathrm{lin}}(M) = \mathrm{Compress}_{\mathrm{lin}}(M, 0)$ instead. Let
$\Delta$ be an element of the kernel of the linearized compression function, i.e.
$\mathrm{Compress}_{\mathrm{lin}}(\Delta) = 0$. We are interested in the probability
$\Pr\{\mathrm{Compress}(M, V) \oplus \mathrm{Compress}(M \oplus \Delta, V) = 0\}$ for a
random $M$ and $V$. In the following we present an algorithm which computes this
probability, called the raw (or bulk) probability.

2.1 Computing the Raw Probability 

We consider a general $n$-bit vector $x = (x_0, \ldots, x_{n-1})$ as an $n$-bit integer
denoted by the same variable, i.e. $x = \sum_{i=0}^{n-1} x_i 2^i$. The Hamming weight of a
binary vector or an integer $x$, $\mathrm{wt}(x)$, is the number of its nonzero elements,
i.e. $\mathrm{wt}(x) = \sum_{i=0}^{n-1} x_i$. We use $+$ for modular addition of words and
$\oplus$, $\vee$ and $\wedge$ for bit-wise XOR, OR and AND logical operations between words
as well as vectors. We use the following lemma, which is a special case of the problem of
computing $\Pr\{((A \oplus \alpha) + (B \oplus \beta)) \oplus (A + B) = \gamma\}$ where
$\alpha$, $\beta$ and $\gamma$ are constants and $A$ and $B$ are independent and uniform
random variables, all of them being $n$-bit words. Lipmaa and Moriai have presented an
efficient algorithm for computing this probability. We are interested in the case
$\gamma = \alpha \oplus \beta$, for which the desired probability has a simple closed form.

Lemma 1. $\Pr\{((A \oplus \alpha) + (B \oplus \beta)) \oplus (A + B) = \alpha \oplus \beta\} = 2^{-\mathrm{wt}((\alpha \vee \beta) \wedge (2^{n-1} - 1))}$.
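
The closed form of Lemma 1 can be checked exhaustively for small word sizes. The following Python sketch is an illustration only and not part of the original framework; the function names `xor_probability` and `lemma1_probability` are hypothetical, and any small word size n can be used.

```python
from itertools import product


def wt(x: int) -> int:
    """Hamming weight of a non-negative integer."""
    return bin(x).count("1")


def xor_probability(alpha: int, beta: int, n: int) -> float:
    """Exact Pr{((A^alpha)+(B^beta)) ^ (A+B) = alpha^beta} over all n-bit A and B."""
    mask = (1 << n) - 1
    hits = 0
    for a, b in product(range(1 << n), repeat=2):
        diff = (((a ^ alpha) + (b ^ beta)) & mask) ^ ((a + b) & mask)
        hits += diff == (alpha ^ beta)
    return hits / (1 << (2 * n))


def lemma1_probability(alpha: int, beta: int, n: int) -> float:
    """Closed form of Lemma 1: each set bit of alpha | beta below the MSB costs 1/2."""
    low_bits = (1 << (n - 1)) - 1
    return 2.0 ** -wt((alpha | beta) & low_bits)
```

Comparing the two functions over all differences for, say, n = 4 shows that the most significant bit of a difference is free, since a carry out of the top bit is discarded by the modular reduction.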

Lemma 1 gives us the probability that modular addition behaves like the XOR operation. As
$\mathrm{Compress}_{\mathrm{lin}}$ approximates Compress by replacing modular addition with
XOR, we can then devise a simple algorithm to compute (estimate) the raw probability
$\Pr\{\mathrm{Compress}(M, V) \oplus \mathrm{Compress}(M \oplus \Delta, V) = \mathrm{Compress}_{\mathrm{lin}}(\Delta)\}$.
Let’s first introduce some notation.

Notation. Let $n_{\mathrm{add}}$ denote the number of additions which Compress uses in
total. In the course of evaluation of $\mathrm{Compress}(M, V)$, let the two addends of the
$i$-th addition ($1 \le i \le n_{\mathrm{add}}$) be denoted by $A^i(M, V)$ and $B^i(M, V)$,
for which the ordering is not important. The value
$C^i(M, V) = (A^i(M, V) + B^i(M, V)) \oplus A^i(M, V) \oplus B^i(M, V)$ is then called the
carry word of the $i$-th addition. Similarly, in the course of evaluation of
$\mathrm{Compress}_{\mathrm{lin}}(\Delta)$, denote the two inputs of the $i$-th linearized
addition by $\alpha^i(\Delta)$ and $\beta^i(\Delta)$, in which the ordering is the same as
that for $A^i$ and $B^i$. We define five more functions $\mathbf{A}(M, V)$,
$\mathbf{B}(M, V)$, $\mathbf{C}(M, V)$, $\boldsymbol{\alpha}(\Delta)$ and
$\boldsymbol{\beta}(\Delta)$ with $(n-1)\,n_{\mathrm{add}}$-bit outputs. These functions are
defined as the concatenation of all the $n_{\mathrm{add}}$ relevant words excluding their
MSBs. For example $\mathbf{A}(M, V)$ and $\boldsymbol{\alpha}(\Delta)$ are respectively the
concatenation of the $n_{\mathrm{add}}$ words
$(A^1(M, V), \ldots, A^{n_{\mathrm{add}}}(M, V))$ and
$(\alpha^1(\Delta), \ldots, \alpha^{n_{\mathrm{add}}}(\Delta))$ excluding the MSBs.

Using this notation, the raw probability can be simply estimated as follows. 

Lemma 2. Let Compress be a modular-addition-based compression function. Then for any
message difference $\Delta$ and for random values $M$ and $V$,
$p_\Delta = 2^{-\mathrm{wt}(\boldsymbol{\alpha}(\Delta) \vee \boldsymbol{\beta}(\Delta))}$
is a lower bound for
$\Pr\{\mathrm{Compress}(M, V) \oplus \mathrm{Compress}(M \oplus \Delta, V) = \mathrm{Compress}_{\mathrm{lin}}(\Delta)\}$.

Proof. We start with the following definition.

Definition 1. We say that a message $M$ (for a given $V$) conforms to (or follows) the
trail of $\Delta$ iff

$((A^i \oplus \alpha^i) + (B^i \oplus \beta^i)) \oplus (A^i + B^i) = \alpha^i \oplus \beta^i$ for $1 \le i \le n_{\mathrm{add}}$, (1)

where $A^i$, $B^i$, $\alpha^i$ and $\beta^i$ are shortened forms for $A^i(M, V)$,
$B^i(M, V)$, $\alpha^i(\Delta)$ and $\beta^i(\Delta)$, respectively.

It is not difficult to prove that under some reasonable independence assumptions
$p_\Delta$, which we call the conforming probability, is the probability that a random
message $M$ follows the trail of $\Delta$. This is a direct corollary of Lemma 1 and
Definition 1. The exact proof can be done by induction on $n_{\mathrm{add}}$, the number of
additions in the compression function. Due to other possible non-conforming pairs that
start from message difference $\Delta$ and lead to output difference
$\mathrm{Compress}_{\mathrm{lin}}(\Delta)$, $p_\Delta$ is a lower bound for the desired
probability in the lemma. □
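
To make the estimate of Lemma 2 concrete, the following Python sketch evaluates it on a tiny, made-up function. Everything here is an assumption for illustration: the two-addition function `compress`, the 4-bit word size and the helper names are hypothetical and are not taken from any real hash function. Passing XOR as the `add` argument yields the linearized version, and the recorded addends of that linear evaluation on the difference give the words $\alpha^i$ and $\beta^i$.

```python
from itertools import product

N = 4                       # toy word size in bits (assumption)
MASK = (1 << N) - 1
LOW = (1 << (N - 1)) - 1    # selects every bit except the MSB


def rotl(x, r):
    return ((x << r) | (x >> (N - r))) & MASK


def mod_add(a, b):
    return (a + b) & MASK


def xor_add(a, b):
    return a ^ b


def compress(m, v, add, trace=None):
    """Hypothetical two-addition toy; `add` selects Compress or Compress_lin."""
    (m0, m1), (v0, v1) = m, v
    if trace is not None:
        trace.append((m0, v0))          # addends of the first addition
    x = add(m0, v0)
    y = rotl(x, 1) ^ m1
    if trace is not None:
        trace.append((y, v1))           # addends of the second addition
    return add(y, v1)


def trail_weight(delta):
    """y = wt(alpha(Delta) | beta(Delta)) with MSBs excluded, so p_Delta = 2**-y."""
    trace = []
    compress(delta, (0, 0), xor_add, trace)
    return sum(bin((a | b) & LOW).count("1") for a, b in trace)


def exact_probability(delta):
    """Pr{Compress(M,V) ^ Compress(M^Delta,V) = Compress_lin(Delta)} by enumeration."""
    target = compress(delta, (0, 0), xor_add)
    hits = 0
    for m0, m1, v0, v1 in product(range(1 << N), repeat=4):
        h1 = compress((m0, m1), (v0, v1), mod_add)
        h2 = compress((m0 ^ delta[0], m1 ^ delta[1]), (v0, v1), mod_add)
        hits += (h1 ^ h2) == target
    return hits / (1 << (4 * N))
```

For the difference `(1, 2)`, which lies in the kernel of this toy linearization, `trail_weight` returns 1 and the exact collision probability equals the bound $2^{-1}$. For other differences the enumerated probability can exceed $2^{-y}$, which is the sense in which $p_\Delta$ is only a lower bound.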

If $\mathrm{Compress}_{\mathrm{lin}}(\Delta)$ is of low Hamming weight, we get a near
collision in the output. The interesting $\Delta$’s for collision search are those which
belong to the kernel of $\mathrm{Compress}_{\mathrm{lin}}$, i.e. those that satisfy
$\mathrm{Compress}_{\mathrm{lin}}(\Delta) = 0$. From now on, we assume that
$\Delta \neq 0$ is in the kernel of $\mathrm{Compress}_{\mathrm{lin}}$, hence looking for
collisions. According to Lemma 2, one needs to try around $1/p_\Delta$ random message pairs
in order to find a collision which conforms to the trail of $\Delta$. However, in a random
search it is better not to restrict oneself to the conforming messages, as a collision at
the end is all we want. Since $p_\Delta$ is a lower bound for the probability of getting a
collision for a message pair with difference $\Delta$, we might get a collision sooner. In
Section 3 we explain a method which might find a conforming message by avoiding random
search.

2.2 Link with Coding Theory 

We would like to conclude this section with a note on the relation between the following
two problems: (I) finding low-weight codewords of a linear code, (II) finding a high
probability linear differential path. Since the functions
$\mathrm{Compress}_{\mathrm{lin}}(\Delta)$, $\boldsymbol{\alpha}(\Delta)$ and
$\boldsymbol{\beta}(\Delta)$ are linear, we consider $\Delta$ as a column vector and
attribute three matrices $\mathcal{H}$, $\mathcal{A}$ and $\mathcal{B}$ to these three
transformations, respectively. In other words we have
$\mathrm{Compress}_{\mathrm{lin}}(\Delta) = \mathcal{H}\Delta$,
$\boldsymbol{\alpha}(\Delta) = \mathcal{A}\Delta$ and
$\boldsymbol{\beta}(\Delta) = \mathcal{B}\Delta$. We then call $\mathcal{H}$ the parity
check matrix of the compression function.

Based on an initial work by Chabaud and Joux, the link between these two problems has been
discussed by Rijmen and Oswald and by Pramstaller et al., with the general conclusion that
finding highly probable differential paths is related to low weight codewords of the
attributed linear code. In fact the relation between these two problems is more delicate.
For problem (I), we are provided with the parity check matrix $\mathcal{H}$ of a linear
code for which a codeword $\Delta$ satisfies the relation $\mathcal{H}\Delta = 0$. Then, we
are supposed to find a low-weight nonzero codeword $\Delta$. This problem is believed to be
hard and there are some heuristic approaches for it. For problem (II), however, we are
given three matrices $\mathcal{H}$, $\mathcal{A}$ and $\mathcal{B}$ and need to find a
nonzero $\Delta$ such that $\mathcal{H}\Delta = 0$ and
$\mathcal{A}\Delta \vee \mathcal{B}\Delta$ is of low weight, see Lemma 2. Nevertheless,
low-weight codewords $\Delta$ of the matrix $\mathcal{H}$ might be good candidates for
providing low-weight $\mathcal{A}\Delta \vee \mathcal{B}\Delta$, i.e. differential paths
with high probability $p_\Delta$. In particular, this approach is promising if these three
matrices are sparse.

3 Finding a Conforming Message Pair Efficiently 

The methods that are used to accelerate the finding of a message which satisfies some
requirements are referred to as freedom degrees use in the literature. This includes
message modifications, neutral bits, boomerang attacks, tunnels and submarine
modifications. In this section we show that the problem of finding conforming message pairs
can be reformulated as finding preimages of zero under a function which we call the
condition function. One can carefully analyze the condition function to see how freedom
degrees might be used in efficient preimage reconstruction. Our method is based on
measuring the amount of influence which every input bit has on each output bit of the
condition function. We introduce dependency tables to distinguish the influential bits
from those which have no influence or are less influential. In other words, in case the
condition function does not mix its input bits well, we profit not only from neutral bits
but also from probabilistic neutral bits. This is achieved by devising a backtracking
search algorithm, similar to those used in earlier work, based on the dependency table.

3.1 Condition Function

Let’s assume that we have a differential path for the message difference $\Delta$ which
holds with probability $p_\Delta = 2^{-y}$. According to Lemma 2 we have
$y = \mathrm{wt}(\boldsymbol{\alpha}(\Delta) \vee \boldsymbol{\beta}(\Delta))$. In this
section we show that, given an initial value $V$, the problem of finding a conforming
message pair such that
$\mathrm{Compress}(M, V) \oplus \mathrm{Compress}(M \oplus \Delta, V) = 0$ can be translated
into finding a message $M$ such that $\mathrm{Condition}_\Delta(M, V) = 0$. Here
$Y = \mathrm{Condition}_\Delta(M, V)$ is a function which maps an $m$-bit message $M$ and a
$v$-bit initial value $V$ into a $y$-bit output $Y$. In other words, the problem is reduced
to finding a preimage of zero under the $\mathrm{Condition}_\Delta$ function. As we will
see, it is quite probable that not every output bit of the Condition function depends on
all the message input bits. With a good strategy, this property enables us to find
preimages under this function more efficiently than random search. But of course, we are
only interested in preimages of zero. In order to explain how we derive the function
Condition from Compress, we first present a quite easy-to-prove lemma. We recall that the
carry word of two words $A$ and $B$ is defined as $C = (A + B) \oplus A \oplus B$.

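Lemma 3. Let $A$ and $B$ be two $n$-bit words and $C$ represent their carry word. Let
$\delta = 2^i$ for $0 \le i \le n-2$. Then,

$((A \oplus \delta) + (B \oplus \delta)) = (A + B) \Leftrightarrow A_i \oplus B_i \oplus 1 = 0$, (2)

$(A + (B \oplus \delta)) = (A + B) \oplus \delta \Leftrightarrow A_i \oplus C_i = 0$, (3)

and similarly

$((A \oplus \delta) + B) = (A + B) \oplus \delta \Leftrightarrow B_i \oplus C_i = 0$. (4)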
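
Since the lemma only involves single words, its three equivalences can be confirmed by brute force for small $n$. The sketch below is an added illustration; the helper names `bit` and `lemma3_holds` are hypothetical.

```python
from itertools import product


def bit(x: int, i: int) -> int:
    """The i-th bit of x, with bit 0 the least significant."""
    return (x >> i) & 1


def lemma3_holds(n: int) -> bool:
    """Check equivalences (2), (3), (4) for all n-bit A, B and delta = 2^i, i <= n-2."""
    mask = (1 << n) - 1
    for a, b in product(range(1 << n), repeat=2):
        s = (a + b) & mask
        c = s ^ a ^ b                      # carry word C = (A + B) xor A xor B
        for i in range(n - 1):
            d = 1 << i
            eq2 = (((a ^ d) + (b ^ d)) & mask) == s
            eq3 = ((a + (b ^ d)) & mask) == (s ^ d)
            eq4 = (((a ^ d) + b) & mask) == (s ^ d)
            if eq2 != ((bit(a, i) ^ bit(b, i) ^ 1) == 0):
                return False
            if eq3 != ((bit(a, i) ^ bit(c, i)) == 0):
                return False
            if eq4 != ((bit(b, i) ^ bit(c, i)) == 0):
                return False
    return True
```

The restriction $i \le n-2$ matters: for the most significant bit the carry out is dropped, so flipping it never needs a condition.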

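For a given difference $\Delta$, a message $M$ and an initial value $V$, let $A_k$, $B_k$,
$C_k$, $\alpha_k$ and $\beta_k$, $0 \le k < (n-1)\,n_{\mathrm{add}}$, respectively denote
the $k$-th bit of the output vectors of the functions $\mathbf{A}(M, V)$,
$\mathbf{B}(M, V)$, $\mathbf{C}(M, V)$, $\boldsymbol{\alpha}(\Delta)$ and
$\boldsymbol{\beta}(\Delta)$, as defined in Section 2.1. Let $\{i_0, \ldots, i_{y-1}\}$,
$0 \le i_0 < i_1 < \cdots < i_{y-1} < (n-1)\,n_{\mathrm{add}}$, be the positions of 1’s in
the vector $\boldsymbol{\alpha} \vee \boldsymbol{\beta}$. We define the function
$Y = \mathrm{Condition}_\Delta(M, V)$ as:

$Y_j = \begin{cases} A_{i_j} \oplus B_{i_j} \oplus 1 & \text{if } (\alpha_{i_j}, \beta_{i_j}) = (1, 1), \\ A_{i_j} \oplus C_{i_j} & \text{if } (\alpha_{i_j}, \beta_{i_j}) = (0, 1), \\ B_{i_j} \oplus C_{i_j} & \text{if } (\alpha_{i_j}, \beta_{i_j}) = (1, 0), \end{cases}$ (5)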

for j = 0,1 .... ,y — 1 . This equation can be equivalently written as equation Q). 

Proposition 1. For a given $V$ and $\Delta$, a message $M$ conforms to the trail of
$\Delta$ iff $\mathrm{Condition}_\Delta(M, V) = 0$.
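
For a single addition, the proposition can be illustrated directly. The following Python sketch is an added example under simplifying assumptions: it treats one addition with input differences `alpha` and `beta` in isolation, and the names `condition_bits` and `conforms` are hypothetical. The first function produces the bits $Y_j$ of equation (5), the second tests the conforming relation of Definition 1.

```python
def condition_bits(a: int, b: int, alpha: int, beta: int, n: int) -> list:
    """Bits Y_j of equation (5) for one addition a + b (MSB position excluded)."""
    mask = (1 << n) - 1
    c = ((a + b) & mask) ^ a ^ b            # carry word of a + b
    out = []
    for i in range(n - 1):
        ai, bi, ci = (a >> i) & 1, (b >> i) & 1, (c >> i) & 1
        pattern = ((alpha >> i) & 1, (beta >> i) & 1)
        if pattern == (1, 1):
            out.append(ai ^ bi ^ 1)
        elif pattern == (0, 1):
            out.append(ai ^ ci)
        elif pattern == (1, 0):
            out.append(bi ^ ci)
    return out


def conforms(a: int, b: int, alpha: int, beta: int, n: int) -> bool:
    """True iff the addition propagates the differences exactly like XOR would."""
    mask = (1 << n) - 1
    diff = (((a ^ alpha) + (b ^ beta)) & mask) ^ ((a + b) & mask)
    return diff == (alpha ^ beta)
```

For every choice of inputs and differences, `conforms(...)` is true exactly when all entries of `condition_bits(...)` are zero. In a full compression function the condition function is the concatenation of these bits over all additions.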

3.2 Dependency Table for Freedom Degrees Use 

For simplicity and generality, let’s adopt the notation
$F(M, V) = \mathrm{Condition}_\Delta(M, V)$ in this section. Assume that we are given a
general function $Y = F(M, V)$ which maps $m$ message bits and $v$ initial value bits into
$y$ output bits. Our goal is to reconstruct preimages of a particular output, for example
the zero vector, efficiently. More precisely, we want to find $V$ and $M$ such that
$F(M, V) = 0$. If $F$ mixes its input bits very well, one needs to try about $2^y$ random
inputs in order to find one mapping to zero. However, in some special cases, not every
input bit of $F$ affects every output bit. Consider an ideal situation where message bits
and output bits can be divided into $\ell$ and $\ell + 1$ disjoint subsets respectively as
$\bigcup_{i=1}^{\ell} \mathcal{M}_i$ and $\bigcup_{i=0}^{\ell} \mathcal{Y}_i$ such that the
output bits $\mathcal{Y}_j$ ($0 \le j \le \ell$) only depend on the input bits
$\bigcup_{i=1}^{j} \mathcal{M}_i$ and the initial value $V$. In other words, once we know
the initial value $V$, we can determine the output part $\mathcal{Y}_0$. If we know the
initial value $V$ and the input portion $\mathcal{M}_1$, the output part $\mathcal{Y}_1$ is
then known, and so on. The application to MD6 later in the paper shows the partitioning of
a condition function. This property of $F$ suggests Algorithm 1 for finding a preimage of
zero. Algorithm 1 is a backtracking search algorithm in essence, similar to those used in
earlier work, and in practice is implemented recursively with a tree-based search to avoid
memory requirements. The values $q_0, q_1, \ldots, q_\ell$ are the parameters of the
algorithm to be determined later. To discuss the complexity of the algorithm, let
$|\mathcal{M}_i|$ and $|\mathcal{Y}_i|$ denote the cardinality of $\mathcal{M}_i$ and
$\mathcal{Y}_i$ respectively, where $|\mathcal{Y}_0| \ge 0$ and $|\mathcal{Y}_i| \ge 1$ for
$1 \le i \le \ell$. We consider an ideal behavior of $F$ for which each output part depends
in a complex way on all the variables that it depends on. Thus, the output segment changes
independently and uniformly at random if we change any part of the relevant input bits.


Algorithm 1. Preimage finding

Require: $q_0, q_1, \ldots, q_\ell$

Ensure: some preimage of zero under $F$

0: Choose $2^{q_0}$ initial values at random and keep those $2^{q'_1}$ candidates which make the $\mathcal{Y}_0$ part null.
1: For each candidate, choose $2^{q_1 - q'_1}$ values for $\mathcal{M}_1$ and keep those $2^{q'_2}$ ones making $\mathcal{Y}_1$ null.
2: For each candidate, choose $2^{q_2 - q'_2}$ values for $\mathcal{M}_2$ and keep those $2^{q'_3}$ ones making $\mathcal{Y}_2$ null.
...
i: For each candidate, choose $2^{q_i - q'_i}$ values for $\mathcal{M}_i$ and keep those $2^{q'_{i+1}}$ ones making $\mathcal{Y}_i$ null.
...
$\ell$: For each candidate, choose $2^{q_\ell - q'_\ell}$ values for $\mathcal{M}_\ell$ and keep those $2^{q'_{\ell+1}}$ final candidates
making $\mathcal{Y}_\ell$ null.
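
The tree-based form of this search can be sketched in a few lines of Python. This is a simplified illustration and not the authors' implementation: it fixes the initial value, enumerates each message segment exhaustively instead of sampling $2^{q_i - q'_i}$ values, and assumes the ideal behavior described above (output bits in `y_parts[i]` depend only on `v` and on the first `i` message segments). The names `find_preimage` and `toy`, and the toy function itself, are hypothetical.

```python
from itertools import product


def find_preimage(f, v, m_parts, y_parts):
    """Depth-first search for a message m with f(m, v) == 0.

    m_parts = [M_1, ..., M_l] lists message bit positions per segment,
    y_parts = [Y_0, ..., Y_l] lists output bit positions per segment.
    Message bits outside all segments are left at zero.
    """
    def violated(m, positions):
        y = f(m, v)
        return any((y >> j) & 1 for j in positions)

    def search(depth, m):
        if violated(m, y_parts[depth]):
            return None                     # prune: Y_depth is not null
        if depth == len(m_parts):
            return m                        # all output segments are null
        segment = m_parts[depth]
        for bits in product((0, 1), repeat=len(segment)):
            cand = m
            for pos, value in zip(segment, bits):
                cand |= value << pos
            found = search(depth + 1, cand)
            if found is not None:
                return found
        return None                         # backtrack one level

    return search(0, 0)


def toy(m, v):
    """Hypothetical 3-bit condition function with a triangular dependency structure."""
    m0, m1, m2 = m & 1, (m >> 1) & 1, (m >> 2) & 1
    v0, v1 = v & 1, (v >> 1) & 1
    y0 = m0 ^ v0
    y1 = m1 ^ (m0 & v1)
    y2 = m2 ^ m1 ^ m0
    return y0 | (y1 << 1) | (y2 << 2)
```

For example, `find_preimage(toy, 3, [[0], [1], [2]], [[], [0], [1], [2]])` fixes one message bit per level and returns a message on which `toy` evaluates to zero, after visiting far fewer candidates than an unstructured search over all inputs would on average.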


To analyze the algorithm, we need to compute the optimal values for $q_0, \ldots, q_\ell$.
The time complexity of the algorithm is $\sum_{i=0}^{\ell} 2^{q_i}$, as at each step
$2^{q_i}$ values are examined. The algorithm is successful if we have at least one
candidate left at the end, i.e. $q'_{\ell+1} \ge 0$. We have
$q'_{i+1} \approx q_i - |\mathcal{Y}_i|$, coming from the fact that at the $i$-th step
$2^{q_i}$ values are examined, each of which makes the portion $\mathcal{Y}_i$ of the
output null with probability $2^{-|\mathcal{Y}_i|}$. Note that we have the restrictions
$q_i - q'_i \le |\mathcal{M}_i|$ and $0 \le q'_i$, since we have $|\mathcal{M}_i|$ bits of
freedom degree at the $i$-th step and we require at least one surviving candidate after
each step. Hence, the optimal values for the $q_i$’s can be recursively computed as
$q_{i-1} = |\mathcal{Y}_{i-1}| + \max(0, q_i - |\mathcal{M}_i|)$ for
$i = \ell, \ell - 1, \ldots, 1$ with $q_\ell = |\mathcal{Y}_\ell|$.
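
The recursion for the $q_i$’s is easy to evaluate numerically. The sketch below is an added illustration; the function names and the segment sizes used in the usage note are made up and do not correspond to any attack in this paper.

```python
import math


def optimal_q(m_sizes, y_sizes):
    """Return [q_0, ..., q_l] from m_sizes = [|M_1|, ..., |M_l|] and y_sizes = [|Y_0|, ..., |Y_l|]."""
    l = len(m_sizes)
    if len(y_sizes) != l + 1:
        raise ValueError("need one more output segment than message segments")
    q = [0] * (l + 1)
    q[l] = y_sizes[l]                                   # q_l = |Y_l|
    for i in range(l, 0, -1):
        # q_{i-1} = |Y_{i-1}| + max(0, q_i - |M_i|)
        q[i - 1] = y_sizes[i - 1] + max(0, q[i] - m_sizes[i - 1])
    return q


def log2_time(q):
    """log2 of the time complexity sum_i 2^{q_i}."""
    return math.log2(sum(2.0 ** qi for qi in q))
```

With the hypothetical sizes `m_sizes = [4, 4]` and `y_sizes = [0, 6, 3]`, the function returns `[2, 6, 3]`, i.e. a cost of $2^2 + 2^6 + 2^3 = 76$ evaluations, compared with about $2^9$ for a random search over the 9 output bits.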

How can we determine the partitions $\mathcal{M}_i$ and $\mathcal{Y}_i$ for a given
function $F$? We propose the following heuristic method for determining the message and
output partitions in practice. We first construct a $y \times m$ binary valued table $T$
called the dependency table. The entry $T_{i,j}$, $0 \le i \le m - 1$ and
$0 \le j \le y - 1$, is set to one iff the $j$-th output bit is highly affected by the
$i$-th message bit. To this end we empirically measure the probability that changing the
$i$-th message bit changes the $j$-th output bit. The probability is computed over random
initial values and messages. We then set $T_{i,j}$ to one iff this probability is greater
than a threshold $0 \le th < 0.5$, for example $th = 0.3$. We then call Algorithm 2.
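
The empirical construction of the dependency table can be sketched as follows. This is an added illustration under stated assumptions: the function `dependency_table`, its parameters (`samples`, `seed`) and the toy function `toy_f` are hypothetical, inputs and outputs are packed into Python integers, and the table is stored with one row per output bit and one column per message bit.

```python
import random


def dependency_table(f, m_bits, v_bits, y_bits, samples=200, th=0.3, seed=0):
    """Estimate T[j][i] = 1 iff flipping message bit i flips output bit j with probability > th."""
    rng = random.Random(seed)
    flips = [[0] * m_bits for _ in range(y_bits)]
    for _ in range(samples):
        m = rng.getrandbits(m_bits)
        v = rng.getrandbits(v_bits)
        base = f(m, v)
        for i in range(m_bits):
            diff = base ^ f(m ^ (1 << i), v)
            for j in range(y_bits):
                flips[j][i] += (diff >> j) & 1
    return [[int(count / samples > th) for count in row] for row in flips]


def toy_f(m, v):
    """Hypothetical function: bits m2 and m3 only rarely influence output bit 2."""
    m0, m1, m2, m3 = [(m >> k) & 1 for k in range(4)]
    v0, v1 = v & 1, (v >> 1) & 1
    y0 = m0 ^ v0
    y1 = m0 ^ m1
    y2 = m1 ^ (m2 & m3 & v0 & v1)
    return y0 | (y1 << 1) | (y2 << 2)
```

In `toy_f`, flipping `m2` changes the third output bit only with probability 1/8, which is below the threshold, so the corresponding table entry is zero. Such a bit plays the role of a probabilistic neutral bit.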


Algorithm 2. Message and output partitioning

Require: Dependency table $T$

Ensure: $\ell$, message partitions $\mathcal{M}_1, \ldots, \mathcal{M}_\ell$ and output partitions $\mathcal{Y}_0, \ldots, \mathcal{Y}_\ell$.

1: Put all the output bits $j$ in $\mathcal{Y}_0$ for which the row $j$ of $T$ is all-zero.
2: Delete all the all-zero rows from $T$.
3: $\ell := 0$;
4: while $T$ is not empty do
5:   $\ell := \ell + 1$;
6:   repeat
7:     Determine the column $i$ in $T$ which has the highest number of 1’s and delete it from $T$.
8:     Put the message bit which corresponds to the deleted column $i$ into the set $\mathcal{M}_\ell$.
9:   until There is at least one all-zero row in $T$ OR $T$ becomes empty
10:  If $T$ is empty set $\mathcal{Y}_\ell$ to those output bits which are not in $\bigcup_{i=0}^{\ell-1} \mathcal{Y}_i$ and stop.
11:  Put all the output bits $j$ in $\mathcal{Y}_\ell$ for which the corresponding row of $T$ is all-zero.
12:  Delete all the all-zero rows from $T$.
13: end while
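
A compact Python rendering of this partitioning is given below as an illustration. It is a sketch under assumptions: the table is a list of rows (one per output bit) of 0/1 entries (one per message bit), ties between columns are broken by the lowest index, and message bits whose columns are never selected are simply left unassigned. The function name `partition` is hypothetical.

```python
def partition(T):
    """Greedy message/output partitioning in the spirit of Algorithm 2.

    Returns (M, Y) with M = [M_1, ..., M_l] and Y = [Y_0, ..., Y_l],
    each entry being a list of bit indices.
    """
    rows = set(range(len(T)))
    cols = set(range(len(T[0]))) if T else set()

    def all_zero_rows():
        # rows with no 1 among the columns that are still present
        return {j for j in rows if not any(T[j][i] for i in cols)}

    zero = all_zero_rows()
    Y = [sorted(zero)]                      # Y_0
    rows -= zero
    M = []
    while rows and cols:
        segment = []
        while True:
            # column with the highest number of 1's among the remaining rows
            best = max(sorted(cols), key=lambda i: sum(T[j][i] for j in rows))
            cols.remove(best)
            segment.append(best)
            zero = all_zero_rows()
            if zero:                        # also holds once no column is left
                break
        M.append(segment)
        Y.append(sorted(zero))
        rows -= zero
    return M, Y
```

On the table `[[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 0, 0]]` this yields the message segments `[[0], [1]]` and the output segments `[[], [0], [1, 2]]`: once message bit 0 is fixed, output bit 0 is determined, and fixing message bit 1 determines the remaining two output bits.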


In practice, once we make a partitioning for a given function using the above method, there
are two issues which may cause the ideal behavior assumption to be violated:

1. The message segments $\mathcal{M}_1, \ldots, \mathcal{M}_i$ do not have full influence on $\mathcal{Y}_i$,

2. The message segments $\mathcal{M}_{i+1}, \ldots, \mathcal{M}_\ell$ have influence on $\mathcal{Y}_0, \ldots, \mathcal{Y}_i$.

With regard to the first issue, we ideally would like that all the message segments
$\mathcal{M}_1, \mathcal{M}_2, \ldots, \mathcal{M}_i$ as well as the initial value $V$ have
full influence on the output part $\mathcal{Y}_i$. In practice the effect of the last few
message segments $\mathcal{M}_{i-d_i}, \ldots, \mathcal{M}_i$ (for some small integer
$d_i$) is more important, though. Theoretical analysis of deviation from this requirement
may not be easy. However, with some tweaks on the tree-based (backtracking) search
algorithm, we may overcome this effect in practice. For example if the message segment
$\mathcal{M}_{i-1}$ does not have a great influence on the output segment $\mathcal{Y}_i$,
we may decide to backtrack two steps at depth $i$, instead of one (the default value). The
reason is as follows. Imagine that you are at depth $i$ of the tree and you are trying to
adjust the $i$-th message segment $\mathcal{M}_i$ to make the output segment
$\mathcal{Y}_i$ null. If, after trying a suitable number of choices for the $i$-th message
block, you do not find an appropriate one, you will go one step backward and choose another
choice for the $(i-1)$-st message segment $\mathcal{M}_{i-1}$; you will then go one step
forward once you have successfully adjusted the $(i-1)$-st message segment. If
$\mathcal{M}_{i-1}$ has no effect on $\mathcal{Y}_i$, this would be useless and increase
our search cost at this node. Hence it would be appropriate if we backtrack two steps at
this depth. In general, we may tweak our tree-based search by setting the number of steps
which we want to backtrack at each depth.

In contrast, the theoretical analysis of the second issue is easy. Ideally, we would like
that the message segments $\mathcal{M}_i, \ldots, \mathcal{M}_\ell$ have no influence on
the output segments $\mathcal{Y}_0, \ldots, \mathcal{Y}_{i-1}$. The smaller the threshold
value $th$ is chosen, the less the influence would be. Let $2^{-p_i}$, $1 \le i \le \ell$,
denote the probability that changing the message segment $\mathcal{M}_i$ does not change
any bit from the output segments $\mathcal{Y}_0, \ldots, \mathcal{Y}_{i-1}$. The
probability is computed over random initial values and messages, and a random non-zero
difference in the message segment $\mathcal{M}_i$. Algorithm 1 must be reanalyzed in order
to recompute the optimal values for $q_0, \ldots, q_\ell$. Algorithm 1 also needs to be
slightly changed by ensuring that at step $i$, all the output segments
$\mathcal{Y}_0, \ldots, \mathcal{Y}_{i-1}$ remain null. The time complexity of the
algorithm is still $\sum_{i=0}^{\ell} 2^{q_i}$ and it is successful if at least one
surviving candidate is left at the end, i.e. $q'_{\ell+1} \ge 0$. However, here we set
$q'_{i+1} \approx q_i - |\mathcal{Y}_i| - p_i$. This comes from the fact that at the $i$-th
step $2^{q_i}$ values are examined, each of which makes the portion $\mathcal{Y}_i$ of the
output null with probability $2^{-|\mathcal{Y}_i|}$ and keeps the previously set output
segments $\mathcal{Y}_0, \ldots, \mathcal{Y}_{i-1}$ null with probability $2^{-p_i}$ (we
assume these two events are independent). Here, our restrictions are again $0 \le q'_i$ and
$q_i - q'_i \le |\mathcal{M}_i|$. Hence, the optimal values for the $q_i$’s can be
recursively computed as
$q_{i-1} = p_{i-1} + |\mathcal{Y}_{i-1}| + \max(0, q_i - |\mathcal{M}_i|)$ for
$i = \ell, \ell - 1, \ldots, 1$ with $q_\ell = |\mathcal{Y}_\ell|$.

Remark 1. When working with functions with a huge number of input bits, it might be
appropriate to consider the $m$-bit message $M$ as a string of $u$-bit units instead of
bits. For example one can take $u = 8$ and work with bytes. We then use the notation
$M = (M[0], \ldots, M[m/u - 1])$ (assuming $u$ divides $m$) where
$M[i] = (M_{iu}, \ldots, M_{iu+u-1})$. In this case the dependency table must be
constructed according to the probability that changing every message unit changes each
output bit.

4 Application to CubeHash 

CubeHash is Bernstein’s proposal for the NIST SHA-3 competition. CubeHash variants, denoted
by CubeHash-$r$/$b$, are parametrized by $r$ and $b$, and at each iteration process $b$
bytes in $r$ rounds. Although CubeHash-8/1 was the original official submission, the
designer later proposed the tweak CubeHash-16/32, which is almost 16 times faster than the
initial proposal. Nevertheless, the author has encouraged cryptanalysis of CubeHash-$r$/$b$
variants for smaller $r$’s and bigger $b$’s.

4.1 CubeHash Description 

CubeHash works with 32-bit words ($n = 32$) and uses three simple operations: XOR, rotation
and modular addition. It has an internal state $S = (S_0, S_1, \ldots, S_{31})$ of 32 words
and its variants, denoted by CubeHash-$r$/$b$, are identified by two parameters
$r \in \{1, 2, \ldots\}$ and $b \in \{1, 2, \ldots, 128\}$. The internal state $S$ is set
to a specified value which depends on the digest length (limited to 512 bits) and
parameters $r$ and $b$. The message to be hashed is appropriately padded and divided into
$b$-byte message blocks. At each iteration one message block is processed as follows. The
32-word internal state $S$ is considered as a 128-byte value and the message block is XORed
into the first $b$ bytes of the internal state. Then, the following fixed permutation is
applied $r$ times to the internal state to prepare it for the next iteration.

1. Add $S_i$ into $S_{i \oplus 16}$, for $0 \le i \le 15$.

2. Rotate $S_i$ to the left by seven bits, for $0 \le i \le 15$.

3. Swap $S_i$ and $S_{i \oplus 8}$, for $0 \le i \le 7$.

4. XOR $S_{i \oplus 16}$ into $S_i$, for $0 \le i \le 15$.

5. Swap $S_i$ and $S_{i \oplus 2}$, for $i \in \{16, 17, 20, 21, 24, 25, 28, 29\}$.

6. Add $S_i$ into $S_{i \oplus 16}$, for $0 \le i \le 15$.

7. Rotate $S_i$ to the left by eleven bits, for $0 \le i \le 15$.

8. Swap $S_i$ and $S_{i \oplus 4}$, for $i \in \{0, 1, 2, 3, 8, 9, 10, 11\}$.

9. XOR $S_{i \oplus 16}$ into $S_i$, for $0 \le i \le 15$.

10. Swap $S_i$ and $S_{i \oplus 1}$, for $i \in \{16, 18, 20, 22, 24, 26, 28, 30\}$.
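
The ten steps above translate almost literally into code. The following Python sketch is an added illustration that follows the list as written here; it has not been checked against official test vectors, and the names `cubehash_round`, `_half_round` and the `linearized` flag are hypothetical. Setting `linearized=True` replaces the two modular additions by XOR, which gives the round of the linearized function used in the differential analysis.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32


def _half_round(s, add, rot, low_xor, low_idx, high_xor, high_idx):
    for i in range(16):                      # add S_i into S_{i xor 16}
        s[i ^ 16] = add(s[i ^ 16], s[i])
    for i in range(16):                      # rotate the low half
        s[i] = rotl32(s[i], rot)
    for i in low_idx:                        # swap within the low half
        s[i], s[i ^ low_xor] = s[i ^ low_xor], s[i]
    for i in range(16):                      # XOR S_{i xor 16} into S_i
        s[i] ^= s[i ^ 16]
    for i in high_idx:                       # swap within the high half
        s[i], s[i ^ high_xor] = s[i ^ high_xor], s[i]


def cubehash_round(state, linearized=False):
    """One round on a list of 32 words; linearized=True replaces + by XOR."""
    if linearized:
        add = lambda a, b: a ^ b
    else:
        add = lambda a, b: (a + b) & MASK32
    s = list(state)
    _half_round(s, add, 7, 8, range(8), 2, (16, 17, 20, 21, 24, 25, 28, 29))
    _half_round(s, add, 11, 4, (0, 1, 2, 3, 8, 9, 10, 11), 1,
                (16, 18, 20, 22, 24, 26, 28, 30))
    return s
```

With `linearized=True` the round is linear over GF(2), so the image of an XOR of two states equals the XOR of their images; this is the property that lets differences be propagated independently of the actual state.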

Having processed all message blocks, a fixed transformation is applied to the final
internal state to extract the hash value as follows. First, the last state word $S_{31}$ is
XORed with the integer 1 and then the above permutation is applied $10 \times r$ times to
the resulting internal state. Finally, the internal state is truncated to produce the
message digest of the desired hash length. Refer to the CubeHash specification for the full
details.

4.2 Definition of the Compression Function Compress 

To be in the line of our general method, we need to deal with fixed-size input compression
functions. To this end, we consider $t$ ($t \ge 1$) consecutive iterations of CubeHash. We
define the function $H = \mathrm{Compress}(M, V)$ with an $8bt$-bit message
$M = M^0 \| \cdots \| M^{t-1}$, a 1024-bit initial value $V$ and a $(1024 - 8b)$-bit output
$H$. The initial value $V$ is used to initialize the 32-word internal state of CubeHash.
Each $M^i$ is a $b$-byte message block. We start from the initialized internal state and
update it in $t$ iterations. That is, in $t$ iterations the $t$ message blocks
$M^0, \ldots, M^{t-1}$ are sequentially processed in order to transform the internal state
into a final value. The output $H$ is then the last $128 - b$ bytes of the final internal
state value which is ready to absorb the $(t+1)$-st message block (the 32-word internal
state is interpreted as a 128-byte vector).

Our goal is to find collisions for this Compress function. In the next section we 
explain how collisions can be constructed for CubeHash itself. 

4.3 Collision Construction 

We are planning to construct collision pairs $(M', M'')$ for CubeHash-$r$/$b$ which are of
the form $M' = M^{\mathrm{pre}} \| M \| M^t \| M^{\mathrm{suf}}$ and
$M'' = M^{\mathrm{pre}} \| M \oplus \Delta \| M^t \oplus \Delta^t \| M^{\mathrm{suf}}$.
Here, $M^{\mathrm{pre}}$ is the common prefix of the colliding pairs whose length in bytes
is a multiple of $b$, $M^t$ is one message block of $b$ bytes and $M^{\mathrm{suf}}$ is the
common suffix of the colliding pairs whose length is arbitrary. The message prefix
$M^{\mathrm{pre}}$ is chosen for randomizing the initial value $V$. More precisely, $V$ is
the content of the internal state after processing the message prefix $M^{\mathrm{pre}}$.
For this value of $V$, $(M, M \oplus \Delta)$ is a collision pair for the compression
function, i.e. $\mathrm{Compress}(M, V) = \mathrm{Compress}(M \oplus \Delta, V)$. Remember
that a collision for Compress indicates a collision over the last $128 - b$ bytes of the
internal state. The message blocks $M^t$ and $M^t \oplus \Delta^t$ are used to get rid of
the difference in the first $b$ bytes of the internal state. The difference $\Delta^t$ is
called the erasing block difference and is computed as follows. When we evaluate Compress
with inputs $(M, V)$ and $(M \oplus \Delta, V)$, $\Delta^t$ is the difference in the first
$b$ bytes of the final internal state values.
Once we find the message prefix $M^{\mathrm{pre}}$, message $M$ and difference $\Delta$,
any message pair $(M', M'')$ of the above-mentioned form is a collision for CubeHash for
any message block $M^t$ and any message suffix $M^{\mathrm{suf}}$. We find the difference
$\Delta$ using the linearization method of Section 2, applied to CubeHash in the next
section. Then, $M^{\mathrm{pre}}$ and $M$ are found by finding a preimage of zero under the
Condition function as explained in Section 3. Algorithm 4 in the extended version of this
article shows how the CubeHash Condition function can be implemented in practice for a
given differential path.

4.4 Linear Differentials for CubeHash-r/b

As we explained in Section 2.2, the linear transformation
$\mathrm{Compress}_{\mathrm{lin}}$ can be identified by a matrix $\mathcal{H}_{h \times m}$.
We are interested in $\Delta$’s such that $\mathcal{H}\Delta = 0$ and such that the
differential trails have high probability. For CubeHash-$r$/$b$ with $t$ iterations,
$\Delta = \Delta^0 \| \cdots \| \Delta^{t-1}$ and $\mathcal{H}$ has size
$(1024 - 8b) \times 8bt$, see Section 4.2. This matrix suffers from having low rank. This
enables us to find low weight vectors of the kernel. We then hope that they are also good
candidates for providing highly probable trails, see Section 2.2. Assume that this matrix
has rank $(8bt - \tau)$, $\tau \ge 0$, signifying existence of $2^\tau - 1$ nonzero
solutions to $\mathcal{H}\Delta = 0$. To find a low weight nonzero $\Delta$, we use the
following method.

The rank of $H$ being $(8bt - \tau)$ shows that the solutions can be expressed by
identifying $\tau$ variables as free and expressing the rest in terms of them. Any choice for the
free variables uniquely determines the remaining $8bt - \tau$ variables, hence providing a
unique member of the kernel. We choose a set of $\tau$ free variables at random. Then, we
set one, two, or three of the $\tau$ free variables to bit value 1, and the other $\tau - 1$, $\tau - 2$ or
$\tau - 3$ variables to bit value 0, with the hope of getting a $\Delta$ providing a high probability
differential path. We have made an exhaustive search over all $\tau + \binom{\tau}{2} + \binom{\tau}{3}$ possible choices
for all $b \in \{1, 2, 3, 4, 8, 16, 32, 48, 64\}$ and $r \in \{1, 2, 3, 4, 5, 6, 7, 8\}$ in order to find
the best characteristics. Table 1 includes the ordered pair $(t, y)$, i.e. the corresponding
number of iterations and the $-\log_2$ probability (number of bit conditions) of the best
raw probability path we found. For most of the cases, the best characteristic belongs to
the minimum value of $t$ for which $\tau > 0$. There are a few exceptions to consider which
are starred in Table 1. For example in the CubeHash-3/4 case, while for $t = 2$ we have
$\tau = 4$ and $y = 675$, by increasing the number of iterations to $t = 4$, we get $\tau = 40$
and a better characteristic with $y = 478$. This may hold for other cases as well since we
only increased $t$ until our program terminated in a reasonable time. We would like to
emphasize that since we are using linear differentials, the erasing block difference $\Delta^t$
only depends on the difference $\Delta$, see Section 4.2.
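
The search just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix is given as a list of integers (bit j of a row is column j), the helper names `rref_gf2`, `kernel_vector` and `low_weight_kernel_vectors` are hypothetical, the free variables are taken in column order rather than chosen at random, and candidates are ranked by Hamming weight as a stand-in for the number of bit conditions $y$ that the text actually optimizes.

```python
from itertools import combinations


def rref_gf2(rows, ncols):
    """Reduced row echelon form over GF(2); each row is an int bitmask."""
    rows = list(rows)
    pivot_cols = []
    rank = 0
    for col in range(ncols):
        bit = 1 << col
        pivot = next((i for i in range(rank, len(rows)) if rows[i] & bit), None)
        if pivot is None:
            continue  # no pivot here: this column is a free variable
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i] & bit:
                rows[i] ^= rows[rank]
        pivot_cols.append(col)
        rank += 1
    return rows[:rank], pivot_cols


def kernel_vector(reduced, pivot_cols, free_mask):
    """Kernel member determined by the free variables set to 1 in free_mask."""
    x = free_mask
    for row, p in zip(reduced, pivot_cols):
        # The pivot variable equals the parity of the selected free variables.
        if bin(row & free_mask).count("1") & 1:
            x |= 1 << p
    return x


def low_weight_kernel_vectors(rows, ncols, max_ones=3):
    """Try every choice of 1..max_ones free variables set to 1."""
    reduced, pivot_cols = rref_gf2(rows, ncols)
    pivots = set(pivot_cols)
    free_cols = [c for c in range(ncols) if c not in pivots]
    found = []
    for k in range(1, max_ones + 1):
        for combo in combinations(free_cols, k):
            mask = sum(1 << c for c in combo)
            x = kernel_vector(reduced, pivot_cols, mask)
            found.append((bin(x).count("1"), x))
    return sorted(found)
```

For a toy matrix with rows `0b0011` and `0b0110` over four columns there are two free variables, so the search visits $2 + \binom{2}{2} + \binom{2}{3} = 3$ candidates. A real attack would replace the weight ranking by an evaluation of the trail probability.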

Table 1. The values of $(t, y)$ for the differential path with the best found raw probability

Second preimage attacks on CubeHash. Any differential path with raw probability
greater than $2^{-512}$ can be considered as a (theoretical) second preimage attack on
CubeHash with 512-bit digest size. In Table 1 the entries which do not correspond to a
successful second preimage attack, i.e. $y > 512$, are shown in gray, whereas the others
have been highlighted. For example, our differential path for CubeHash-6/4 with raw
probability $2^{-478}$ indicates that by only one hash evaluation we can produce a second
preimage with probability $2^{-478}$. Alternatively, it can be stated that for a fraction of
$2^{-478}$ of the messages we can easily provide a second preimage. The list of differential trails
for highlighted entries can be found in the extended version.

4.5 Collision Attacks on CubeHash Variants 

Although Table 1 includes our best found differential paths with respect to raw
probability or equivalently second preimage attack, when it comes to the use of freedom
degrees for collision attacks, these trails might not be the optimal ones. In other words,
for a specific $r$ and $b$, there might be another differential path which is worse in terms
of raw probability but is better regarding the collision attack complexity if we use some
freedom degrees speedup. As an example, for CubeHash-3/48 with the path which has raw
probability $2^{-364}$, using our method of Section 3 the time complexity can be reduced
to about $2^{58.9}$ (partial) evaluations of its condition function. However, there is another
path with raw probability $2^{-368}$ which has a time complexity of about $2^{53.3}$ (partial)
evaluations of its condition function. Table 2 shows the best paths we found regarding the
reduced complexity of the collision attack using our method of Section 3. While most
of the paths are still the optimal ones with respect to the raw probability, the starred
entries indicate the ones which invalidate this property. Some of the interesting differential
paths for starred entries in Table 2 are given in the extended version.

Table 3 shows the reduced time complexities of the collision attack using our method of
Section 3 for the differential paths of Table 2. To construct the dependency table, we
have analyzed the Condition function at byte level, see Remark 1. The time complexities
are given as base-2 logarithms and might be improved if the dependency table is analyzed
at bit level instead. The complexity unit is a (partial) evaluation of the respective Condition
function. We recall that the full evaluation of a Condition function corresponding to a
$t$-iteration differential path is almost the same as the application of $t$ iterations ($rt$ rounds)
of CubeHash. We emphasize that the complexities are independent of the digest size. All
the complexities which are less than $2^{c/2}$ can be considered as a successful collision
attack if the hash size is bigger than $c$ bits. The complexities bigger than $2^{256}$ have been
shown in gray as they are worse than the birthday attack, considering 512-bit digest size.
The successfully attacked instances have been highlighted.

The astute reader should realize that the complexities of Table 3 correspond to the
optimal threshold value, see Section 3.2. Refer to the extended version to see the
effect of the threshold value on the complexity.

Practice versus theory. We provided a framework which is handy in order to analyze 
many hash functions in a generic way. In practice, the optimal threshold value may 
be a little different from the theoretical one. Moreover, by slightly playing with the 
neighboring bits in the suggested partitioning corresponding to a given threshold value 

Table 2. The values of $(t, y)$ for the differential path with the best found total complexity (Table 3
includes the reduced complexities using our method of Section 3)


r \ b | 1           | 2         | 3         | 4         | 8         | 12         | 16         | 32         | 48         | 64
------+-------------+-----------+-----------+-----------+-----------+------------+------------+------------+------------+-----------
1     | (14, 1225)  | (8, 221)  | (4, 46)   | (4, 32)   | (4, 32)   | -          | -          | -          | -          | -
2     | (7, 1225)   | (4, 221)  | (2, 46)   | (2, 32)   | (2, 32)   | -          | -          | -          | -          | -
3     | (16, 4238)  | (6, 1881) | (4, 798)  | (4, 478)  | (4, 478)  | (4, 400)   | (4, 400)   | (4, 400)   | (3, 368)*  | (2, 65)
4     | (8, 2614)   | (3, 964)  | (2, 195)  | (2, 189)  | (2, 189)  | (2, 156)   | (2, 156)   | (2, 156)   | (2, 134)*  | (2, 134)*
5     | (18, 10221) | (8, 4579) | (4, 2433) | (4, 1517) | (4, 1517) | (4, 1250)* | (4, 1250)* | (4, 1250)* | (4, 1250)* | (2, 205)
6     | (10, 4238)  | (3, 1881) | (2, 798)  | (2, 478)  | (2, 478)  | (2, 400)   | (2, 400)   | (2, 400)   | (2, 351)   | (2, 351)
7     | (14, 13365) | (8, 5820) | (4, 3028) | (4, 2124) | (4, 2124) | (4, 1748)  | (4, 1748)  | (4, 1748)  | (4, 1748)  | (2, 455)*
8     | (4, 2614)   | (4, 2614) | (2, 1022) | (2, 1009) | (2, 1009) |            | (2, 830)   | (2, 830)   | (2, 655)*  | (2, 655)*


(as produced by the partitioning algorithm), we may achieve a partitioning which is more
suitable for applying the attacks. In particular, Table 3 contains the theoretical
complexities for different CubeHash instances under the assumption that the Condition
function behaves ideally with respect to the first issue discussed in Section 3. In practice,
deviation from this assumption increases the effective complexity. For particular instances,
more simulations need to be done to analyze the potential non-randomness effects in order
to give a more exact estimation of the practical complexity.

According to Section 4.2, for a given linear difference $\Delta$, we need to find a message
prefix $M_{pre}$ and a conforming message $M$ for collision construction. Our backtracking
(tree-based) search implementation for CubeHash-3/64 finds $M_{pre}$ and
$M$ in $2^{21}$ (median complexity) instead of the $2^{9.4}$ of Table 3. The median decreases to $2^{17}$
by backtracking three steps at each depth instead of one, see Section 3. For CubeHash-4/48
we achieve the median complexity $2^{30.4}$, which is very close to the theoretical value
$2^{30.7}$ of Table 3. Collision examples for CubeHash-3/64 and CubeHash-4/48 can be
found in the extended paper. Our detailed analysis of CubeHash variants shows that
the practical complexities for all of them except 3-round CubeHash are very close to
the theoretical values of Table 3. We expect the practical complexities for CubeHash
instances with three rounds to be slightly bigger than the given theoretical numbers. For
detailed comments we refer to the extended paper.

Comparison with the previous results. The first analysis of CubeHash was proposed
by Aumasson et al., in which the authors showed some non-random properties for
several versions of CubeHash. A series of collision attacks on CubeHash-1/b and
CubeHash-2/b for large values of b were announced by Aumasson and Dai.


Table 3. Theoretical $\log_2$ complexities of improved collision attacks with freedom degrees use at
byte level for the differential paths of Table 2

Collision attacks were later investigated deeply by Brier and Peyrin. Our results
improve on all existing ones as well as attacking some untouched variants.

5 Generalization 

In Sections 2 and 3 we considered modular-addition-based compression functions which
use only modular additions and linear transformations. Moreover, we concentrated on
the XOR approximation of modular additions in order to linearize the compression
function. This method is however quite general and can be applied to a broad class of hash
constructions, covering many of the existing hash functions. Additionally, it lets us
consider other linear approximations as well. We view a compression function
$H = \mathrm{Compress}(M, V) : \{0,1\}^m \times \{0,1\}^v \to \{0,1\}^h$ as a binary finite state machine (FSM).
The FSM has an internal state which is consecutively updated using message $M$ and
initial value $V$. We assume that the FSM operates as follows, and we refer to such Compress
functions as binary-FSM-based. The concept can also cover non-binary fields.

The internal state is initially set to zero. Afterwards, the internal state is sequentially 
updated in a limited number of steps. The output value H is then derived by truncating 
the final value of the internal state to the specified output size. At each step, the internal 
state is updated according to one of these two possibilities: either the whole internal state 
is updated as an affine transformation of the current internal state, M and V, or only one 
bit of the internal state is updated as a nonlinear Boolean function of the current internal 
state, M and V. Without loss of generality, we assume that all of the nonlinear updat- 
ing Boolean functions (NUBF) have zero constant term (i.e. the output of zero vector is 
zero) and none of the involved variables appear as a pure linear term (i.e. changing any 
input variable does not change the output bit with certainty). This assumption, coming 
from the simple observation that we can integrate constants and linear terms in an affine 
updating transformation (AUT), is essential for our analysis. Linear approximations of 
the FSM can be achieved by replacing AUTs with linear transformations (by ignoring
the constant terms) and NUBFs with linear functions of their arguments. Similar to
Section 2, this gives us a linearized version of the compression function which we
denote by $\mathrm{Compress}_{\mathrm{lin}}(M, V)$. As we are dealing with differential cryptanalysis, we take
the notation $\mathrm{Compress}_{\mathrm{lin}}(M) = \mathrm{Compress}_{\mathrm{lin}}(M, 0)$. The argument given in Section 2
is still valid: elements of the kernel of the linearized compression function (i.e. $\Delta$'s s.t.
$\mathrm{Compress}_{\mathrm{lin}}(\Delta) = 0$) can be used to construct differential trails.

Let $n_{nl}$ denote the total number of NUBFs in the FSM. We count the NUBFs by
starting from zero. We introduce four functions $\Lambda(M, V)$, $\Phi(\Delta)$, $\Lambda^{\Delta}(M, V)$ and $\Gamma(\Delta)$,
all of output size $n_{nl}$ bits. To define these functions, consider the two procedures which
implement the FSMs of $\mathrm{Compress}(M, V)$ and $\mathrm{Compress}_{\mathrm{lin}}(\Delta)$. Let the Boolean function
$g^k$, $0 \le k < n_{nl}$, stand for the $k$-th NUBF and denote its linear approximation as
in $\mathrm{Compress}_{\mathrm{lin}}$ by $g^k_{\mathrm{lin}}$. Moreover, denote the input arguments of the Boolean functions
$g^k$ and $g^k_{\mathrm{lin}}$ in the FSMs which compute $\mathrm{Compress}(M, V)$ and $\mathrm{Compress}_{\mathrm{lin}}(\Delta)$ by the
vectors $x^k$ and $\delta^k$, respectively. Note that $\delta^k$ is a function of $\Delta$ whereas $x^k$ depends
on $M$ and $V$. The $k$-th bit of $\Gamma(\Delta)$, $\Gamma_k(\Delta)$, is set to one iff the argument of the $k$-th
linearized NUBF is not the all-zero vector, i.e. $\Gamma_k(\Delta) = 1$ iff $\delta^k \neq 0$. We then define
$\Lambda_k(M, V) = g^k(x^k)$, $\Phi_k(\Delta) = g^k_{\mathrm{lin}}(\delta^k)$ and $\Lambda^{\Delta}_k(M, V) = g^k(x^k \oplus \delta^k)$. We can then
present the following proposition. The proof is given in the full version of the paper.

Proposition 2. Let Compress be a binary-FSM-based compression function. For any
message difference $\Delta$, let $\{i_0, \ldots, i_{y-1}\}$, $0 \le i_0 < i_1 < \cdots < i_{y-1} < n_{nl}$, be the
positions of 1's in the vector $\Gamma(\Delta)$, where $y = \mathrm{wt}(\Gamma(\Delta))$. We define the condition
function $Y = \mathrm{Condition}_{\Delta}(M, V)$ where the $j$-th bit of $Y$ is computed as

$$ Y_j = \Lambda_{i_j}(M, V) \oplus \Lambda^{\Delta}_{i_j}(M, V) \oplus \Phi_{i_j}(\Delta). \tag{6} $$

Then, if $\Delta$ is in the kernel of $\mathrm{Compress}_{\mathrm{lin}}$, $\mathrm{Condition}_{\Delta}(M, V) = 0$ implies that the pair
$(M, M \oplus \Delta)$ is a collision for Compress with the initial value $V$.

Remark 2. The modular-addition-based compression functions can be implemented as
binary-FSM-based compression functions by considering one bit of memory for the carry bit.
All the NUBFs for this FSM are of the form $g(x, y, z) = xy \oplus xz \oplus yz$. The XOR approximation
of modular addition in Section 2 corresponds to approximating all the NUBFs $g$ by the
zero function, i.e. $g_{\mathrm{lin}}(x, y, z) = 0$. It is straightforward to show that
$\Lambda_k(M, V) = g(A_k, B_k, C_k)$ and $\Phi_k(\Delta) = g_{\mathrm{lin}}(\alpha_k, \beta_k, 0)$. We then deduce that
$\Gamma_k(\Delta) = \alpha_k \vee \beta_k \vee 0$ and $\Lambda^{\Delta}_k(M, V) = g(A_k \oplus \alpha_k, B_k \oplus \beta_k, C_k \oplus 0)$. As a result we get

$$ Y_j = \Lambda_{i_j}(M, V) \oplus \Lambda^{\Delta}_{i_j}(M, V) \oplus \Phi_{i_j}(\Delta)
       = (\alpha_{i_j} \oplus \beta_{i_j}) C_{i_j} \oplus \alpha_{i_j} B_{i_j} \oplus \beta_{i_j} A_{i_j} \oplus \alpha_{i_j} \beta_{i_j} $$

whenever $\alpha_{i_j} \vee \beta_{i_j} = 1$; this agrees with the corresponding equation of Section 2. Refer to
the extended version for more details and to see how other linear approximations could be used.
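
The closed form in Remark 2 can be checked exhaustively, since it only involves five bits. The snippet below is a small sanity check rather than part of the attack; the function names `condition_bit` and `closed_form` are hypothetical, and the carry difference is assumed to be zero, as in the remark.

```python
from itertools import product


def g(x, y, z):
    """Carry (majority) function of a one-bit full adder."""
    return (x & y) ^ (x & z) ^ (y & z)


def condition_bit(a, b, c, alpha, beta):
    """Lambda xor Lambda^Delta xor Phi for one NUBF, with g_lin = 0."""
    lam = g(a, b, c)
    lam_delta = g(a ^ alpha, b ^ beta, c ^ 0)
    phi = 0  # zero-function approximation of the NUBF
    return lam ^ lam_delta ^ phi


def closed_form(a, b, c, alpha, beta):
    """(alpha xor beta) C xor alpha B xor beta A xor alpha beta."""
    return ((alpha ^ beta) & c) ^ (alpha & b) ^ (beta & a) ^ (alpha & beta)


mismatches = [bits for bits in product((0, 1), repeat=5)
              if condition_bit(*bits) != closed_form(*bits)]
```

The list `mismatches` stays empty over all 32 inputs, so the two expressions agree even outside the case $\alpha \vee \beta = 1$ in which the condition bit is actually used.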

6 Application to MD6 

MD6, designed by Rivest et al., is a SHA-3 candidate that provides security proofs
regarding some differential attacks. The core part of MD6 is the function $f$ which
works with 64-bit words and maps 89 input words $(A_0, \ldots, A_{88})$ into 16 output words
$(A_{16r+73}, \ldots, A_{16r+88})$ for some integer $r$ representing the number of rounds. Each
round is composed of 16 steps. The function $f$ is computed based on the following
recursion

$$ A_{i+89} = L_{r_i, l_i}\bigl(S_i \oplus A_i \oplus (A_{i+71} \wedge A_{i+68}) \oplus (A_{i+58} \wedge A_{i+22}) \oplus A_{i+72}\bigr), \tag{8} $$

where the $S_i$'s are some publicly known constants and the $L_{r_i, l_i}$'s are some known simple linear
transformations. The 89-word input of $f$ is of the form $Q \| U \| W \| K \| B$ where $Q$ is a
known 15-word constant value, $U$ is a one-word node ID, $W$ is a one-word control word,
$K$ is an 8-word key and $B$ is a 64-word data block. For more details about the function $f$ and
the mode of operation of MD6, we refer to the submission document. We consider
the compression function $H = \mathrm{Compress}(M, V) = f(Q \| U \| W \| K \| B)$ where
$V = U \| W \| K$, $M = B$ and $H$ is the 16-word compressed value. Our goal is to find a collision
$\mathrm{Compress}(M, V) = \mathrm{Compress}(M', V)$ for an arbitrary value of $V$. We later explain how
such collisions can be translated into collisions for the MD6 hash function.

According to our model (Section 5), MD6 can be implemented as an FSM which
has $64 \times 16r$ NUBFs of the form $g(x, y, z, w) = x \cdot y \oplus z \cdot w$. Remember that

2 In the MD6 document, $W$ and $L_{r_i, l_i}$ are respectively denoted by $V$ and $g_{r_i, l_i}$.
the NUBFs must not include any linear part or constant term. We focus on the case
where we approximate all NUBFs with the zero function. This corresponds to ignoring
the AND operations in equation (8). This essentially says that in order to compute
$\mathrm{Compress}_{\mathrm{lin}}(\Delta) = \mathrm{Compress}_{\mathrm{lin}}(\Delta, 0)$ for a 64-word $\Delta = (\Delta_0, \ldots, \Delta_{63})$, we map
$(A'_0, \ldots, A'_{24}, A'_{25}, \ldots, A'_{88}) = 0 \| \Delta = (0, \ldots, 0, \Delta_0, \ldots, \Delta_{63})$ into the 16 output
words $(A'_{16r+73}, \ldots, A'_{16r+88})$ according to the linear recursion

$$ A'_{i+89} = L_{r_i, l_i}(A'_i \oplus A'_{i+72}). \tag{9} $$

For a given $\Delta$, the function $\Gamma$ is the concatenation of the $16r$ words
$A'_{i+71} \vee A'_{i+68} \vee A'_{i+58} \vee A'_{i+22}$, $0 \le i \le 16r - 1$. Therefore, the number of bit
conditions equals

$$ y = \sum_{i=0}^{16r-1} \mathrm{wt}\bigl(A'_{i+71} \vee A'_{i+68} \vee A'_{i+58} \vee A'_{i+22}\bigr). \tag{10} $$

Note that this equation compactly integrates cases 1 and 2 given in section 6.9.3.2
of the MD6 document for counting the number of active AND gates. Algorithm 3 in the
extended version of this article shows how the Condition function is implemented using
equations (6), (8) and (9).
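
As a rough illustration of equations (9) and (10), the following Python sketch runs the linearized recursion and counts the bit conditions. Several things here are assumptions and not taken from the text: the function name `linearized_md6`, the form of $L_{r_i, l_i}$ (a right shift-and-XOR followed by a left shift-and-XOR), and the shift table `SHIFTS`, which is quoted from memory of the MD6 specification and should be checked against the specification before any real use.

```python
MASK64 = (1 << 64) - 1

# ASSUMPTION: (r_i, l_i) shift pairs for step i mod 16, recalled from the
# MD6 specification; verify against the specification before relying on them.
SHIFTS = [(10, 11), (5, 24), (13, 9), (10, 16), (11, 15), (12, 9), (2, 27),
          (7, 15), (14, 6), (15, 2), (7, 29), (13, 8), (11, 15), (7, 5),
          (6, 31), (12, 9)]


def linearized_md6(delta, rounds, shifts=SHIFTS):
    """Return (16 output words, y) for a 64-word message difference."""
    if len(delta) != 64:
        raise ValueError("delta must have 64 words")
    a = [0] * 25 + [d & MASK64 for d in delta]   # A'_0 .. A'_88 = 0 || delta
    steps = 16 * rounds
    for i in range(steps):
        r, l = shifts[i % 16]
        x = a[i] ^ a[i + 72]                     # AND terms and S_i dropped
        x ^= x >> r                              # assumed form of L_{r,l}
        x ^= (x << l) & MASK64
        a.append(x)                              # this is A'_{i+89}
    # Equation (10): weight of the OR of the four AND-gate input words.
    y = sum(bin(a[i + 71] | a[i + 68] | a[i + 58] | a[i + 22]).count("1")
            for i in range(steps))
    return a[-16:], y
```

Whatever the exact shift values, the map from `delta` to the output words is linear over GF(2), which is the property the kernel search relies on. If the assumed table and indexing match the specification, calling the function with the difference of equation (11) and `rounds=16` should give an all-zero output and `y = 90`; that check has not been carried out here.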

Using a similar linear algebraic method to the one used in Section 4.4 for CubeHash,
we have found the collision difference of equation (11) for $r = 16$ rounds with a raw
probability $p_{\Delta} = 2^{-90}$. In other words, $\Delta$ is in the kernel of $\mathrm{Compress}_{\mathrm{lin}}$ and the
condition function has $y = 90$ output bits. Note that this does not contradict the proven
bound in the MD6 document: one gets at least 26 active AND gates.

$$ \Delta_i = \begin{cases}
\texttt{F6D164597089C40E} & i = 2 \\
\texttt{2000000000000000} & i = 36 \\
0 & 0 \le i \le 63,\ i \ne 2, 36
\end{cases} \tag{11} $$

In order to efficiently find a conforming message pair for this differential path we need
to analyze the dependency table of its condition function. Referring to our notations
in Section 3, our analysis of the dependency table of the function $\mathrm{Condition}_{\Delta}(M, 0)$ at
word level (units of $u = 64$ bits) shows that the partitioning of the condition function
is as in Table 4 for threshold value $th = 0$. For this threshold value clearly $p_i = 0$.
The optimal values for the $q_i$'s (computed according to the complexity analysis of the same
section) are also given in Table 4, showing a total attack complexity of $2^{30.6}$ (partial)
condition function evaluations (see footnote 3). By analyzing the dependency table with
smaller units the complexity may be subject to reduction.

A collision example for $r = 16$ rounds of $f$ can be found in the full version.
Our 16-round colliding pair provides near collisions for $r = 17$, 18 and 19 rounds,
respectively, with 63, 144 and 270 bit differences over the 1024-bit long output of $f$.
Refer to the full version to see how collisions for reduced-round $f$ can be turned into
collisions for the reduced-round MD6 hash function. The original MD6 submission mentions
inversion of the function $f$ up to a dozen rounds using SAT solvers. Some slight nonrandom
behavior of the function $f$ up to 33 rounds has also been reported.

3 By masking $M_{38}$ and $M_{55}$ respectively with 092E9BA68F763BF1 and
DFFBFF7FEFFDFFBF after random setting, the 35 condition bits of the first three
steps are satisfied for free, reducing the complexity to $2^{30.0}$ instead.

Table 4. Input and output partitionings of the Condition function of MD6 with r = 16 rounds

i  | M_i                                                               | Y_i                          | q_i | p_i
---+-------------------------------------------------------------------+------------------------------+-----+----
0  | (illegible)                                                       | ∅                            | 0   | 0
1  | {M_38}                                                            | {Y_1, ..., Y_29}             | 29  | 0
2  | {M_55}                                                            | (illegible)                  | 6   | 0
3  | {M_0, M_5, M_46, M_52, M_54}                                      | {Y_0}                        | 1   | 0
4  | {M_j | j = 3, 4, 6, 9, 21, 36, 39, 40, 42, 45, 49, 50, 53, 56, 57} | {Y_31, ..., Y_36}            | 6   | 0
5  | {M_41, M_51, M_58, M_59, M_60}                                    | (illegible)                  | 2   | 0
6  | {M_j | j = 1, 7, 8, 10, 11, 12, 17, 18, 20, 22, 24, 25, 26, 29,   | {Y_52, ..., Y_57}            | 6   | 0
   |        33, 34, 37, 43, 44, 47, 48, 61, 62, 63}                    |                              |     |
7  | {M_27}                                                            | {Y_37, ..., Y_42}            | 6   | 0
8  | {M_13, M_16, M_23}                                                | {Y_50}                       | 1   | 0
9  | {M_35}                                                            | {Y_30}                       | 1   | 0
10 | {M_14, M_15, M_19, M_28}                                          | {Y_58, Y_61}                 | 2   | 0
11 | {M_30, M_31, M_32}                                                | {Y_59, Y_60, Y_62, ..., Y_89} | 30  | 0


7 Conclusion 

We presented a framework for an in-depth study of linear differential attacks on hash 
functions. We applied our method to reduced round variants of CubeHash and MD6, 
giving by far the best known collision attacks on these SHA-3 candidates. Our results 
may be improved by considering start-in-the-middle attacks if the attacker is allowed to
choose the initial value of the internal state.

Abstract. In this paper, we present preimage attacks on up to 43-step
SHA-256 (around 67% of the total 64 steps) and 46-step SHA-512
(around 57.5% of the total 80 steps), which significantly increases the
number of attacked steps compared to the best previously published
preimage attack working for 24 steps. The time complexities are $2^{251.9}$,
$2^{509}$ for finding pseudo-preimages and $2^{254.9}$, $2^{511.5}$ compression
function operations for full preimages. The memory requirements are
modest, around $2^{6}$ words for 43-step SHA-256 and 46-step SHA-512. The
pseudo-preimage attack also applies to 43-step SHA-224 and SHA-384.
Our attack is a meet-in-the-middle attack that uses a range of novel
techniques to split the function into two independent parts that can be
computed separately and then matched in a birthday-style phase.

Keywords: SHA-256, SHA-512, hash, preimage attack, meet-in-the-middle.


1 Introduction 

Cryptographic hash functions are important building blocks of many secure
systems. SHA-1 and SHA-2 (SHA-224, SHA-256, SHA-384, and SHA-512) [1] are
hash functions standardized by the National Institute of Standards and
Technology (NIST) and widely used all over the world. However, a collision attack
on SHA-1 has been discovered recently by Wang et al. [2]. Since the structure of
SHA-2 is similar to SHA-1 and they are both heuristic designs with no known

* This work was done while visiting Technical University of Denmark and was partly 
supported by a DCAMM grant. 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 578-597, 2009.
security guarantees or reductions, an attack on SHA-2 might be discovered in
the future too. To avoid a situation in which all FIPS standardized functions would
be broken, NIST is currently conducting a competition to determine a new hash
function standard called SHA-3 [3]. From the engineering viewpoint, migration
from SHA-1 to SHA-3 will take a long time. SHA-2 will play an important role
during that transitional period. Hence, rigorous security evaluation of SHA-2
using the latest analytic techniques is important.

NIST requires SHA-3 candidates of $n$-bit hash length to satisfy several
security properties [3], first and foremost:

- Preimage resistance of $n$ bits,

- Second-preimage resistance of $n - k$ bits for any message shorter than $2^k$
blocks,

- Collision resistance of $n/2$ bits.

NIST claims that the security of each candidate is evaluated in the environment
where they are tuned so that they run as fast as SHA-2 [4]. It seems that NIST
tries to evaluate each candidate by comparing it with SHA-2. However, the
security of SHA-2 is not well understood yet. Hence, the evaluation of the security 
of SHA-2 with respect to the security requirements for SHA-3 candidates is also 
important as it may influence our perspective on the SHA-3 speed requirements. 

SHA-256 and SHA-512 consist of 64 steps and 80 steps, respectively. The first
analysis of SHA-2 with respect to collision resistance was described by Mendel
et al. [5], which presented a collision attack on SHA-2 reduced to 19 steps.
After that, several researchers have improved the result. In particular, the work
by Nikolic and Biryukov improved the collision techniques [6]. The best collision
attacks so far are the ones proposed by Indesteege et al. [7] and Sanadhya and
Sarkar [8], both describing collision attacks for 24 steps. The only analysis of
preimage resistance we are aware of is a recent attack on 24 steps of SHA-2 due
to Isobe and Shibutani [9].

One may note the work announced at the rump session by Yu and Wang [10],
which claimed to have found a non-randomness property of SHA-256 reduced
to 39 steps. Since the non-randomness property is not included in the security
requirements for SHA-3, we do not discuss it in this paper. In summary, the
current best attacks on SHA-2 with respect to the security requirements for
SHA-3 work for only 24 steps.

After Saarinen [11] and Leurent [12] showed examples of meet-in-the-middle
preimage attacks, the techniques for such preimage attacks have been developed
very rapidly. Attacks based on the concept of meet-in-the-middle have been
reported for various hash functions, for example MD5 [13], SHA-1, HAVAL [14],
and so on [15, 16, 17, 18]. The meet-in-the-middle preimage attack has also been applied
to the recently designed hash function ARIRANG [19], which is one of the SHA-3
candidates, by Hong et al. [20]. However, due to the complex message schedule in
SHA-2, these recently developed techniques have not been applied to SHA-2 yet.


Our contribution. We propose preimage attacks on 43-step SHA-256 and 46-step
SHA-512 which drastically increase the number of attacked steps compared
to the previous preimage attack on 24 steps. We first explain various attack
techniques for attacking SHA-2. We then explain how to combine these
techniques to maximize the number of attacked steps. It is interesting that more
steps of SHA-512 can be attacked than of SHA-256 with the so-called partial-fixing
technique proposed by Aoki and Sasaki. This is due to the difference of the
word size, as the functions $\sigma$ and $\Sigma$ mix 32-bit variables in SHA-256 more rapidly
than in the case of double-size variables in SHA-512.

Our attacks are meet-in-the-middle. We first consider the application of the 
previous meet-in-the-middle techniques to SHA-2. We then analyse the message 
expansion of SHA-2 by considering all previous techniques and construct the 
attack by finding new independent message-word partition, which is the funda- 
mental part of this attack. 

Our attacks and a comparison with other results are summarized in Table 1.


Table 1. Comparison of preimage attacks on reduced SHA-2

Reference                | Target  | Steps | Complexity        | Complexity  | Memory
                         |         |       | (pseudo-preimage) | (preimage)  | (approx.)
-------------------------+---------+-------+-------------------+-------------+-----------------
Ours                     | SHA-224 | 43    | 2^219.9           | -           | 2^6 words
Isobe and Shibutani [9]  | SHA-256 | 24    | 2^240             | 2^240       | 2^16 * 64 bits
Ours                     | SHA-256 | 42    | 2^245.3           | 2^251.7     | 2^12 words
Ours                     | SHA-256 | 43    | 2^251.9           | 2^254.9     | 2^6 words
Ours                     | SHA-384 | 43    | 2^366             | -           | 2^19 words
Isobe and Shibutani [9]  | SHA-512 | 24    | 2^480             | 2^480       | not given
Ours                     | SHA-512 | 42    | 2^488             | 2^501       | 2^27 words
Ours                     | SHA-512 | 46    | 2^509             | 2^511.5     | 2^6 words



Outline. In Section 2 we briefly describe SHA-2. Section 3 gives an overview of
the meet-in-the-middle preimage attack. In Section 4 we describe all techniques
of our preimage attack. Then Sections 5 and 6 explain how these techniques can
be applied together to mount an attack on SHA-256 and SHA-512, respectively.
In Section 7 we make some remarks on our attack. Section 8 concludes this paper.

2 SHA-2 Specification 

Description of SHA-256. In this section we describe SHA-256; consult [1] for
full details. SHA-256 adopts the Merkle-Damgard structure [21, Algorithm 9.25].
The message string is first padded with a single "1" bit, an appropriate number of
zero bits and then the 64-bit length of the original message, so that the length of the
padded message is a multiple of 512 bits, and then divided into 512-bit blocks,
$(M_0, M_1, \ldots, M_{N-1})$ where $M_i \in \{0,1\}^{512}$.

The hash value $h_N$ is computed by iteratively using the compression function
CF, which takes a 512-bit message block and a 256-bit chaining variable as the
input and yields an updated 256-bit chaining variable as the output,

$$ \begin{cases} h_0 \leftarrow IV, \\ h_{i+1} \leftarrow \mathrm{CF}(h_i, M_i) \quad (i = 0, 1, \ldots, N-1), \end{cases} \tag{1} $$

where IV is a constant value defined in the specification.

The compression function is based on the Davies-Meyer mode [21, Algorithm
9.42]. It consists of a message expansion and a data processing. Let $\gg x$ and
$\ggg x$ denote the $x$-bit right shift and rotation, respectively. First, the message block
is expanded by the message expansion function,

$$ W_i \leftarrow \begin{cases}
m_i & \text{for } 0 \le i < 16, \\
\sigma_1(W_{i-2}) + W_{i-7} + \sigma_0(W_{i-15}) + W_{i-16} & \text{for } 16 \le i < 64,
\end{cases} \tag{2} $$

where $(m_0, m_1, \ldots, m_{15}) \leftarrow M$ ($m_i \in \{0,1\}^{32}$) and "+" denotes addition
modulo $2^{\text{word-size}}$. In SHA-256 the word size is 32 bits. Functions $\sigma_0(X)$ and $\sigma_1(X)$
are defined as

$$ \begin{aligned}
\sigma_0(X) &\leftarrow (X \ggg 7) \oplus (X \ggg 18) \oplus (X \gg 3), \\
\sigma_1(X) &\leftarrow (X \ggg 17) \oplus (X \ggg 19) \oplus (X \gg 10),
\end{aligned} \tag{3} $$

where "$\oplus$" stands for the bitwise XOR operation.

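Let us use $p_j$ to denote a 256-bit value consisting of the concatenation of eight
words $A_j, B_j, C_j, D_j, E_j, F_j, G_j$ and $H_j$. The data processing computes $h_{i+1}$ as
follows.

$$ \begin{cases}
p_0 \leftarrow h_i, \\
p_{j+1} \leftarrow R_j(p_j, W_j) \quad (j = 0, 1, \ldots, 63), \\
h_{i+1} \leftarrow h_i + p_{64}.
\end{cases} \tag{4} $$

Step function $R_j$ is defined as follows

$$ \begin{cases}
T_1^{(j)} \leftarrow H_j + \Sigma_1(E_j) + \mathrm{Ch}(E_j, F_j, G_j) + K_j + W_j, \\
T_2^{(j)} \leftarrow \Sigma_0(A_j) + \mathrm{Maj}(A_j, B_j, C_j), \\
A_{j+1} \leftarrow T_1^{(j)} + T_2^{(j)}, \quad B_{j+1} \leftarrow A_j, \quad C_{j+1} \leftarrow B_j, \quad D_{j+1} \leftarrow C_j, \\
E_{j+1} \leftarrow D_j + T_1^{(j)}, \quad F_{j+1} \leftarrow E_j, \quad G_{j+1} \leftarrow F_j, \quad H_{j+1} \leftarrow G_j.
\end{cases} \tag{5} $$

Above, $K_j$ is a constant, different for each step, and the following functions are
used

$$ \begin{aligned}
\mathrm{Ch}(X, Y, Z) &\leftarrow (X \wedge Y) \oplus ((\neg X) \wedge Z), \\
\mathrm{Maj}(X, Y, Z) &\leftarrow (X \wedge Y) \oplus (X \wedge Z) \oplus (Y \wedge Z), \\
\Sigma_0(X) &\leftarrow (X \ggg 2) \oplus (X \ggg 13) \oplus (X \ggg 22), \\
\Sigma_1(X) &\leftarrow (X \ggg 6) \oplus (X \ggg 11) \oplus (X \ggg 25),
\end{aligned} \tag{6} $$

where $\neg$ means bitwise negation of the word.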
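
The description above is complete enough to be turned into a short reference implementation, which can help when experimenting with step-reduced variants. The Python sketch below follows equations (1) to (6); the helper names are illustrative, and instead of listing the constants it derives IV and $K_j$ from the fractional parts of the square and cube roots of the first primes, which is how the standard defines them. It is meant for study, not for production use.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def sigma0(x): return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
def sigma1(x): return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)
def big_sigma0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
def big_sigma1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
def ch(x, y, z): return (x & y) ^ (~x & z & MASK)
def maj(x, y, z): return (x & y) ^ (x & z) ^ (y & z)


def first_primes(count):
    primes, c = [], 2
    while len(primes) < count:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes


def icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
IV = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]   # frac(sqrt(p)) * 2^32
K = [icbrt(p << 96) & MASK for p in PRIMES]             # frac(cbrt(p)) * 2^32


def expand(block):
    """Message expansion, equation (2)."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        w.append((sigma1(w[i - 2]) + w[i - 7] + sigma0(w[i - 15]) + w[i - 16]) & MASK)
    return w


def compress(h, block):
    """Compression function CF: 64 steps plus the feed-forward, equation (4)."""
    w = expand(block)
    a, b, c, d, e, f, g, hh = h
    for j in range(64):
        t1 = (hh + big_sigma1(e) + ch(e, f, g) + K[j] + w[j]) & MASK
        t2 = (big_sigma0(a) + maj(a, b, c)) & MASK
        a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]


def sha256(message):
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    h = IV
    for off in range(0, len(padded), 64):
        h = compress(h, padded[off:off + 64])
    return struct.pack(">8I", *h)


# Cross-check against the standard library implementation.
assert sha256(b"abc") == hashlib.sha256(b"abc").digest()
```

Restricting the loop in `compress` to fewer iterations gives a step-reduced compression function of the kind attacked in this paper.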


Description of SHA-512. The structure of SHA-512 is basically the same as
SHA-256. In SHA-512, the word size is 64 bits, double that of SHA-256; hence, the
message-block size is 1024 bits and the size of the chaining variable $p_j$ is 512 bits.
The compression function has 80 steps. Rotation numbers in $\sigma_0$, $\sigma_1$, $\Sigma_0$, and
$\Sigma_1$ are different from those used in SHA-256, and are shown below.

$$ \begin{aligned}
\sigma_0(X) &\leftarrow (X \ggg 1) \oplus (X \ggg 8) \oplus (X \gg 7), \\
\sigma_1(X) &\leftarrow (X \ggg 19) \oplus (X \ggg 61) \oplus (X \gg 6), \\
\Sigma_0(X) &\leftarrow (X \ggg 28) \oplus (X \ggg 34) \oplus (X \ggg 39), \\
\Sigma_1(X) &\leftarrow (X \ggg 14) \oplus (X \ggg 18) \oplus (X \ggg 41).
\end{aligned} \tag{7} $$


3 Overview of the Meet-in-the-Middle Preimage Attack 

A preimage attack on a narrow-pipe Merkle-Damgard hash function is usually 
based on a pseudo-preimage attack on its underlying compression function, where 
a pseudo-preimage is a preimage of the compression function with an appropriate
padding. Many compression functions adopt the Davies-Meyer mode, which
computes $E_u(v) \oplus v$, where $u$ is the message, $v$ is the intermediate hash value
and $E$ is a block cipher.

First we recall the attack strategy on a compression function, which is
illustrated in Fig. 1. Denote by $h$ the given target hash value. The high-level
description of the attack for the simplest case is as follows.

1. Divide the key $u$ of the block cipher $E$ into two independent parts: $u_1$ and
$u_2$. Hereafter, independent parts are called "chunks" and independent inputs
$u_1$ and $u_2$ are called "neutral words".

2. Randomly determine the other input value $v$ of the block cipher $E$.

3. Carry out the forward calculation utilizing $v$ and all possible values of $u_1$,
and store all the obtained intermediate values in a table $T_F$.

4. Carry out the backward calculation utilizing $h \oplus v$ and all possible values of
$u_2$, and store all the intermediate values in a table $T_B$.

5. Check whether there exists a collision between $T_F$ and $T_B$. If a collision
exists, a pseudo-preimage of $h$ has been generated. Otherwise, go to Step 2.

The main novelty of the meet-in-the-middle preimage attacks is, by utilizing
independence of $u_1$ and $u_2$ of the key input, transforming the problem of finding
a preimage of $h$ to the problem of finding a collision on the intermediate
values, which has a much lower complexity than the former one. Suppose there


[Figure: Davies-Meyer compression function computing $E_u(v) \oplus v$]

Fig. 1. Meet-in-the-middle attack strategy
are $2^t$ possible values for each of $u_1$ and $u_2$. Using $2^t$ compression function
computations, the attacker obtains $2^t$ elements in each of $T_F$ and $T_B$. The collision
probability is roughly $2^{2t-n}$, where $n$ is the bit length of $h$, much better than the
probability $2^{t-n}$ of finding a preimage by a brute force search with complexity $2^t$.
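
The five steps of the basic attack can be tried out on a toy example. The sketch below is purely illustrative and makes several assumptions that are not in the text: a made-up 16-bit block cipher whose first half uses only $u_1$ and whose second half uses only $u_2$, 8-bit neutral words (so $t = 8$ and $n = 16$), and a deterministic scan over $v$ instead of random choices so that the run is reproducible. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
M16 = 0xFFFF
MUL = 0x6487                          # odd, hence invertible modulo 2^16
MUL_INV = pow(MUL, -1, 1 << 16)
C1 = (0x1111, 0x2222, 0x3333)         # round constants of the first chunk
C2 = (0x4444, 0x5555, 0x6666)         # round constants of the second chunk


def rotl16(x, n):
    return ((x << n) | (x >> (16 - n))) & M16


def rotr16(x, n):
    return ((x >> n) | (x << (16 - n))) & M16


def half_enc(x, k, consts):
    """Toy keyed permutation: add key and constant, multiply, rotate."""
    for c in consts:
        x = (x + k * 0x0101 + c) & M16
        x = (x * MUL) & M16
        x = rotl16(x, 5)
    return x


def half_dec(x, k, consts):
    """Inverse of half_enc for the same key and constants."""
    for c in reversed(consts):
        x = rotr16(x, 5)
        x = (x * MUL_INV) & M16
        x = (x - k * 0x0101 - c) & M16
    return x


def compress(v, u1, u2):
    """Davies-Meyer: E_u(v) xor v with u = (u1, u2)."""
    return half_enc(half_enc(v, u1, C1), u2, C2) ^ v


def find_pseudo_preimage(h):
    for v in range(1 << 16):                                     # Step 2
        forward = {half_enc(v, u1, C1): u1 for u1 in range(256)}  # Step 3
        for u2 in range(256):                                    # Step 4
            middle = half_dec(h ^ v, u2, C2)
            if middle in forward:                                # Step 5
                return v, forward[middle], u2
    return None
```

Each value of $v$ costs $2 \cdot 2^8$ half-cipher evaluations and succeeds with probability about $2^{2 \cdot 8 - 16} = 1$, so a match is expected after only a few values of $v$, whereas testing the same $2^8$ keys one by one would succeed with probability about $2^{8-16}$. In a real hash function the two chunks are not cleanly separated, which is what the techniques of Section 4 address.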

4 The List of Attack Techniques 

This section describes the list of techniques used in the attack. Some of them
were used before in previous meet-in-the-middle attacks [15, 18]. We explain
them here first and then in Sections 5 and 6 we show how to combine them in
an attack on SHA-2.


4.1 Splice-and-Cut 

The meet-in-the-middle attack starts with dividing the key input into two
independent parts. The idea of splice-and-cut is based on the observation made
in previous work that the last and first steps of the block cipher $E$ in Davies-Meyer mode
can be regarded as consecutive by considering the feed-forward operation.

This allows the attacker to choose any step as the starting step of the meet- 
in-the-middle, which helps with finding more suitable independent chunks. 

This technique can find only pseudo-preimages of the given hash value instead 
of preimages. However, pseudo-preimages can be converted to preimages with a 
conversion algorithm explained below. 


4.2 Converting Pseudo-preimages to Preimages 

In $x$-bit iterated hash functions, a pseudo-preimage attack with complexity $2^y$, $y < x - 2$, can be converted to a preimage attack with complexity $2^{\frac{x+y}{2}+1}$ [Fact 9.99]. The idea is to apply an unbalanced meet-in-the-middle attack: generate $2^{(x-y)/2}$ pseudo-preimages and $2^{(x+y)/2}$ 1-block chaining variables starting from the IV.
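The arithmetic of this conversion is easy to check numerically. The helper below is an illustrative sketch (the function name and the dictionary layout are our own); it works with base-2 logarithms and reproduces the preimage complexities quoted later in the paper from the pseudo-preimage complexities.

```python
def conversion_cost_log2(x: float, y: float) -> dict:
    """Costs (as log2) of converting a 2^y pseudo-preimage attack on an
    x-bit iterated hash into a preimage attack."""
    if not y < x - 2:
        raise ValueError("the conversion only helps when y < x - 2")
    return {
        # Number of pseudo-preimages to collect, each costing 2^y.
        "pseudo_preimages": (x - y) / 2,
        # Number of 1-block chaining values computed from the IV.
        "chaining_values": (x + y) / 2,
        # Both phases cost 2^((x+y)/2); together 2^((x+y)/2 + 1).
        "total": (x + y) / 2 + 1,
    }


sha256_cost = conversion_cost_log2(256, 251.878)
sha512_cost = conversion_cost_log2(512, 509)
```

For the 43-step SHA-256 attack ($y = 251.878$) this gives a total exponent of about 254.939, and for the 46-step SHA-512 attack ($y = 509$) it gives 511.5.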

4.3 Partial-Matching 

The example in Fig. 1 is the simplest and most optimistic case. In fact, in the previous attacks, the key input cannot be divided into just two independent chunks. Usually, besides the two independent chunks $u_1$ and $u_2$, there is another part, which depends on both $u_1$ and $u_2$. Hence, the stored intermediate values in $T_F$ and $T_B$ are ones at different steps. This raises a problem: how can the values in $T_F$ and $T_B$ be compared? However, many hash functions, including SHA-2, have an Unbalanced Feistel Network structure, where the intermediate values are only partially updated in one step. This means that a part of the intermediate values does not change during several steps, and the attacker can check the match of two values partially.

Consider SHA-2, and assume one chunk produces the value of $p_j$ and the other chunk produces the value of $p_{j+s}$. The attacker wants to efficiently check whether or not $p_j$ and $p_{j+s}$ match without the knowledge of $W_j, W_{j+1}, \ldots, W_{j+s-1}$. In SHA-2, the maximum value of $s$ is 7.

Assume the value of $p_{j+7} = A_{j+7} \| B_{j+7} \| \cdots \| H_{j+7}$ is known and $W_{j+6}$ is unknown. By backward computation, we can obtain the values of $A_{j+6}, B_{j+6}, \ldots, G_{j+6}$. This is because $A_{j+6}, B_{j+6}, C_{j+6}, E_{j+6}, F_{j+6}$, and $G_{j+6}$ are just copies of the corresponding values in $p_{j+7}$, and $D_{j+6}$ is computed as follows.

$D_{j+6} \leftarrow E_{j+7} - (A_{j+7} - (\Sigma_0(B_{j+7}) + \mathrm{Maj}(B_{j+7}, C_{j+7}, D_{j+7})))$. (8)

By repeating similar computations, in the end $A_j$ is computed from $p_{j+7}$ without the knowledge of $W_j, W_{j+1}, \ldots, W_{j+6}$. Note that this technique was already used (but not explicitly named) in previous work.
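The 7-step backward computation can be checked with a short Python sketch of the SHA-256 step function. The helper names are ours, unknown registers are represented by `None`, and the step constants are random placeholders rather than the real $K_j$, since the argument does not depend on their values.

```python
import random

MASK = 0xFFFFFFFF


def rotr(x, r):
    return ((x >> r) | (x << (32 - r))) & MASK


def big_sigma0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def ch(x, y, z):
    return ((x & y) ^ (~x & z)) & MASK


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def step_forward(state, k, w):
    """One SHA-256 step: p_{j+1} from p_j, K_j and W_j."""
    a, b, c, d, e, f, g, h = state
    t1 = (h + big_sigma1(e) + ch(e, f, g) + k + w) & MASK
    t2 = (big_sigma0(a) + maj(a, b, c)) & MASK
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)


def step_backward_without_w(state):
    """Invert one step without W_j; registers that cannot be recovered are None."""
    a1, b1, c1, d1, e1, f1, g1, h1 = state
    a, b, c = b1, c1, d1          # plain copies
    e, f, g = f1, g1, h1          # plain copies
    if None in (a1, b1, c1, d1, e1):
        d = None
    else:
        # Equation (8): D_j = E_{j+1} - (A_{j+1} - (Sigma0(B_{j+1}) + Maj(...)))
        t2 = (big_sigma0(b1) + maj(b1, c1, d1)) & MASK
        d = (e1 - ((a1 - t2) & MASK)) & MASK
    return (a, b, c, d, e, f, g, None)  # H_j needs W_j, so it is unknown


rng = random.Random(2009)
p_j = tuple(rng.getrandbits(32) for _ in range(8))
state = p_j
for _ in range(7):
    state = step_forward(state, rng.getrandbits(32), rng.getrandbits(32))
partial = state
for _ in range(7):
    partial = step_backward_without_w(partial)
```

After seven backward steps only the first register survives: `partial[0]` equals $A_j$ and every other entry is `None`, which is why $s = 7$ is the limit for plain partial-matching.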


4.4 Partial-Fixing 

This is an extension of the partial-matching technique that considers parts of registers of the internal state. It increases the number of steps that can exist between two independent chunks. Assume that the attacker is carrying out the computation using $u_1$ and faces a step whose key input depends on both $u_1$ and $u_2$. Because the computation cannot go ahead without the knowledge of $u_2$, the chunk for $u_1$ must stop at this step. The partial-fixing technique partially fixes the values of $u_1$ and $u_2$ so that we can obtain partial knowledge even if the full computation depends on both $u_1$ and $u_2$.

The partial-fixing technique for SHA-2 has not been considered previously. Assume we can fix the lower $x$ bits of the message word in each step. Under this assumption, 1 step can be partially computed easily. Let us consider the step function of SHA-2 in the forward direction. The equations involving $W_j$ are as follows.

$T_1^{(j)} \leftarrow H_j + \Sigma_1(E_j) + \mathrm{Ch}(E_j, F_j, G_j) + K_j + W_j$,
$A_{j+1} \leftarrow T_1^{(j)} + T_2^{(j)}$, $E_{j+1} \leftarrow D_j + T_1^{(j)}$. (9)

If the lower $x$ bits of $W_j$ are fixed, the lower $x$ bits of $A_{j+1}$ (and $E_{j+1}$) can be computed independently of the upper $32 - x$ bits of $W_j$. Let us now consider skipping another step in the forward direction. The equation for $A_{j+2}$ is as follows:

$A_{j+2} \leftarrow T_1^{(j+1)} + \Sigma_0(A_{j+1}) + \mathrm{Maj}(A_{j+1}, B_{j+1}, C_{j+1})$. (10)

We know only the lower $x$ bits of $A_{j+1}$. Hence, we can compute the Maj function for only the lower $x$ bits. How about the $\Sigma_0$ function? We analysed the relationship between the number of consecutive fixed bits from the LSB in the input and output of $\sigma_0$, $\sigma_1$, $\Sigma_0$, and $\Sigma_1$. The results are summarized in Table 2.

From Table 2, if $x$ is large enough, we can compute the lower $x - 22$ bits of $A_{j+2}$ in SHA-256 and the lower $x - 39$ bits in SHA-512, though the number of known bits is greatly reduced after the $\Sigma_0$ function. This fact also implies that we cannot obtain the value of $A_{j+3}$, since the number of fixed bits will always be 0. In the end, the partial-fixing technique can be applied for up to 2 steps in the forward direction. Similarly, we considered the partial-fixing technique in the backward direction, and found that it can be applied for up to 6 steps.

Table 2. Relationship of number of consecutive fixed bits from LSB in input and output of $\sigma$ and $\Sigma$

          SHA-256                                           SHA-512
          $\Sigma_0$  $\Sigma_1$  $\sigma_0$  $\sigma_1$    $\Sigma_0$  $\Sigma_1$  $\sigma_0$  $\sigma_1$
Input     x           x           x           x             x           x           x           x
Output    x - 22      x - 25      x - 18      x - 19        x - 39      x - 41      x - 8       x - 61

When x agrees with the word size, the output is x. When the number described in the output is negative, the output is 0.
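The SHA-256 entries of Table 2 can be checked empirically. The sketch below (helper names are ours) fixes the lower $x$ input bits, varies only the upper bits with a seeded random generator, and counts how many output bits, starting from the LSB, never change. It uses the standard SHA-256 definitions of $\sigma_0$ and $\Sigma_0$ and assumes $1 \le x \le 31$.

```python
import random

MASK32 = 0xFFFFFFFF


def rotr32(x, r):
    return ((x >> r) | (x << (32 - r))) & MASK32


def small_sigma0(x):
    # SHA-256 message-expansion function sigma_0.
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3)


def big_sigma0(x):
    # SHA-256 step-function function Sigma_0.
    return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22)


def stable_low_bits(func, x, trials=256, seed=1):
    """Count consecutive output bits from the LSB that stay constant when
    only the upper 32 - x input bits vary (requires 1 <= x <= 31)."""
    rng = random.Random(seed)
    low = rng.getrandbits(x)              # the fixed lower x bits
    reference = func(low)
    changed = 0
    for _ in range(trials):
        high = rng.getrandbits(32 - x) << x
        changed |= func(high | low) ^ reference
    count = 0
    while count < 32 and not (changed >> count) & 1:
        count += 1
    return count


sigma0_known = stable_low_bits(small_sigma0, 23)
big_sigma0_known = stable_low_bits(big_sigma0, 23)
```

With $x = 23$, the value used later for SHA-256, the counts agree with the table: $23 - 18 = 5$ bits survive $\sigma_0$ and $23 - 22 = 1$ bit survives $\Sigma_0$. The loss is simply the largest rotation amount of each function.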

However, there is another problem with the first assumption, namely that the lower $x$ bits of each message word can be fixed. This is difficult to achieve because the fixed bits in message words are mixed by the $\sigma$ functions in the message expansion. In fact, for SHA-256 we could apply the partial-fixing technique for computing only 1 step in the forward direction and only 2 steps in the backward direction. However, in SHA-512, the bit-mixing speed of $\sigma$ is relatively slow due to the doubled word size, and we could compute 2 steps forward and 6 steps backward. Finally, 10 steps in total can be skipped by the partial-matching and partial-fixing techniques for SHA-256, and 15 steps for SHA-512. (These numbers of steps are explained in Sections 5 and 6.)

4.5 Indirect-Partial-Matching 

This is another extension of partial-matching. Consider the intermediate values in $T_F$ and $T_B$. We can express them as functions of $u_1$ and $u_2$, respectively. If the next message word used in the forward direction can be expressed as $\psi_1(u_1) + \xi_2(u_2)$ and the computation of the chaining register at the matching point does not destroy this relation (because the message word is also added), the matching point can still be expressed as a sum of two independent functions of $u_1$ and $u_2$, e.g. $\psi_F(u_1) + \xi_F(u_2)$.

Similarly, we can express the matching point from the backward direction as $\psi_B(u_1) + \xi_B(u_2)$, and we are to find a match. Now, instead of finding a match directly, we can compute $\psi_F(u_1) - \psi_B(u_1)$ in the forward direction and $\xi_B(u_2) - \xi_F(u_2)$ in the backward direction independently and find a match.

In the case of SHA-2, it is possible to extend the 7-step partial-matching to 9-step indirect-partial-matching by inserting one step just before and one step just after the partial matching.
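The rearrangement behind indirect-partial-matching can be illustrated with a toy example in Python. The four functions below are arbitrary placeholders on 8-bit words (they are not derived from SHA-2); the point is only that matching $\psi_F(u_1) + \xi_F(u_2) = \psi_B(u_1) + \xi_B(u_2)$ can be done with one table per neutral word instead of testing every pair.

```python
WORD = 1 << 8  # toy 8-bit registers; additions are modulo 2^8


# Hypothetical placeholder functions of the two neutral words.
def psi_f(u1):
    return (37 * u1 + 11) % WORD


def psi_b(u1):
    return (u1 * u1 + 5) % WORD


def xi_f(u2):
    return (91 * u2 + 3) % WORD


def xi_b(u2):
    return (u2 * u2 * u2 + 17) % WORD


def direct_matches(space):
    """Reference answer: test every (u1, u2) pair explicitly."""
    return {
        (u1, u2)
        for u1 in range(space)
        for u2 in range(space)
        if (psi_f(u1) + xi_f(u2)) % WORD == (psi_b(u1) + xi_b(u2)) % WORD
    }


def indirect_matches(space):
    """Indirect matching: each side is computed from one neutral word only."""
    table = {}
    for u1 in range(space):
        key = (psi_f(u1) - psi_b(u1)) % WORD      # depends on u1 only
        table.setdefault(key, []).append(u1)
    found = set()
    for u2 in range(space):
        key = (xi_b(u2) - xi_f(u2)) % WORD        # depends on u2 only
        for u1 in table.get(key, []):
            found.add((u1, u2))
    return found
```

Both functions return the same set of pairs, but `indirect_matches` evaluates each function once per neutral-word value rather than once per pair.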

Note that this technique can be combined with the partial-fixing technique by applying them in order: partial-fixing, partial-matching and indirect-partial-matching. However, some constraints need to be satisfied, such as the independence of the message word used in indirect-partial-matching, while we need to be able to compute enough bits at the matching point in order to carry out the partial-matching efficiently.

4.6 Initial Structure

In some cases, the two independent chunks $u_1$ and $u_2$ will overlap with each other. The typical example is that the order of the input key of $E$ is $u_1 u_2 u_1 u_2$. This creates a problem: how should the attacker carry out the forward and backward computations independently? The Initial Structure technique was proposed in earlier work to solve such a problem. Previous attacks usually set a certain step as the starting step, then randomly determine the intermediate value at that step, and carry out the independent computations. However, the initial structure technique sets all the steps of $u_2 u_1$ in the middle of $u_1 u_2 u_1 u_2$ together as the starting point. Denote the intermediate values at the beginning and last step of $u_2 u_1$ as $I_1$ and $I_2$ respectively. For each possible value of $u_1$, the attacker can derive a corresponding value $I_1$. Similarly, for each possible value of $u_2$, the attacker can derive a corresponding value $I_2$. Moreover, any pair $(I_1, u_1)$ and $(I_2, u_2)$ can be matched at the steps of $u_2 u_1$ of $u_1 u_2 u_1 u_2$. Thus, the attacker can carry out independent computations utilizing $(I_1, u_1)$ and $(I_2, u_2)$.

The initial structure for SHA-2 makes use of the absorption property of the function $\mathrm{Ch}(x, y, z) = xy \oplus (\neg x)z$. If $x$ is 1 (all bits are 1), then $\mathrm{Ch}(1, y, z) = y$, which means $z$ does not affect the result of the Ch function in this case; similarly, when $x$ is 0 (all bits are 0), $y$ does not affect the result. When we want to control only part of the output (a few bits), we need to fix the corresponding bits of $x$ instead of all bits of $x$.
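The absorption property, including its partial form, can be confirmed directly. In this short Python sketch the sample words `Y` and `Z` and the helper `top_bits` are arbitrary choices made for illustration.

```python
MASK32 = 0xFFFFFFFF


def ch(x, y, z):
    # Bitwise choice: where x has a 1 take the bit of y, otherwise the bit of z.
    return ((x & y) ^ (~x & z)) & MASK32


def top_bits(value, l):
    """Return the l most significant bits of a 32-bit word."""
    return value >> (32 - l)


Y, Z = 0x12345678, 0x9ABCDEF0

# Full absorption: z is ignored when x is all ones, y when x is all zeros.
all_ones_case = ch(MASK32, Y, Z)
all_zeros_case = ch(0, Y, Z)

# Partial absorption: fixing only the 8 MSB of x controls the 8 MSB of Ch.
x_msb_ones = 0xFF000000 | 0x00123456
x_msb_zeros = 0x00ABCDEF
partial_y = top_bits(ch(x_msb_ones, Y, Z), 8)
partial_z = top_bits(ch(x_msb_zeros, Y, Z), 8)
```

Here `all_ones_case` equals `Y`, `all_zeros_case` equals `Z`, and the two partial results equal the top 8 bits of `Y` and `Z` respectively, regardless of the lower 24 bits of `x`.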

We consider 4 consecutive step functions, i.e. from step $i$ to step $i+3$. We show that, under certain conditions, we can move the last message word $W_{i+3}$ to step $i$ and move $W_i$ to step $i+1$ while keeping the final output after step $i+3$ unchanged.

Assume we want to transfer a message word $W_{i+3}$ upwards. Due to the absorption property of Ch, we can move $W_{i+3}$ to step $i+2$ (adding it to register $G_{i+2}$) if all the bits of $E_{i+2}$ are fixed to 1. This is illustrated in Fig. 2 (left). Similarly, we can further move $W_{i+3}$ to step $i+1$ (adding it to register $F_{i+1}$) if all the bits of $E_{i+1}$ are 0. Then, we can still move it upwards by transferring it to register $E_i$ after the step transformation in step $i$.

The same principle applies if we want to transfer only part of the word $W_{i+3}$. If the $l$ most significant bits (MSB) of $W_{i+3}$ are arbitrary and the rest are set to zero (to avoid interference with addition on the least significant bits), we need to fix the $l$ MSB of $E_{i+2}$ to one and the $l$ MSB of $E_{i+1}$ to zero.

As the $l$ MSB of $E_{i+1}$ need to be 0, we need to use the $l$ MSB of $W_i$ to satisfy this requirement. This reduces the space of $W_i$ to $2^{32-l}$. Similarly, we need to choose those $W_i$ that fix the $l$ MSB of $E_{i+2}$ to one. This is possible because changing the value of $W_i$ influences the state of register $E_{i+2}$ through $\Sigma_1$ at step $i+1$. We experimentally checked that changing $W_i$ generates changes in $E_{i+2}$ that are sufficiently close to uniformly distributed. Satisfying the additional constraints on $l$ bits further reduces the space of $W_i$ to $2^{32-2l}$.

The important thing to note here is that if we fix the values of $F_{i+1}$, $G_{i+1}$ and of the sum $D_{i+1} + H_{i+1}$, we can precompute the set of good values for $W_i$ and store them in a table. Then, we can later recall them at negligible cost.

Fig. 2. Initial structure for SHA-2 allows moving the addition of $W_{i+3}$ upwards, provided that the Ch functions absorb the appropriate inputs (left); moving $W_i$ one step downwards (right)

On the other hand, message word $W_i$ can be moved to step $i+1$ with no constraint, as shown in Fig. 2 (right).

This procedure essentially swaps the order of words $W_i$ and $W_{i+3}$.

4.7 Two-Way Expansion 

Message expansion usually works in such a way that several consecutive message words determine the rest. For SHA-2, any 16 consecutive message words determine the rest, since the message expansion is a bijective mapping. This enables us to control any 16 intermediate message words and then expand the rest in both directions. This technique gives us more freedom in the choice of neutral words, and considerably extends the number of steps for the two chunks. Note that the maximum number of consecutive steps for the two chunks is 30 for SHA-2. Since the message expansion is a bijective mapping, no matter which neutral word is chosen, it must be used to compute at least one of any 16 consecutive message words. So each chunk of consecutive steps is of length at most 15.
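A minimal Python sketch of two-way expansion for SHA-256 follows. The function name `expand_two_way` and its argument layout are our own; the $\sigma$ functions and the recurrence are the standard SHA-256 message expansion, solved for the oldest word when going backwards.

```python
import random

MASK32 = 0xFFFFFFFF


def rotr32(x, r):
    return ((x >> r) | (x << (32 - r))) & MASK32


def sigma0(x):
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3)


def sigma1(x):
    return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10)


def expand_two_way(window, start, total=64):
    """Given 16 consecutive words W_start..W_{start+15}, recover all words."""
    w = [0] * total
    w[start:start + 16] = window
    # Forward: W_i = sigma1(W_{i-2}) + W_{i-7} + sigma0(W_{i-15}) + W_{i-16}
    for i in range(start + 16, total):
        w[i] = (sigma1(w[i - 2]) + w[i - 7] + sigma0(w[i - 15]) + w[i - 16]) & MASK32
    # Backward: the same relation solved for the oldest word W_i.
    for i in range(start - 1, -1, -1):
        w[i] = (w[i + 16] - sigma1(w[i + 14]) - w[i + 9] - sigma0(w[i + 1])) & MASK32
    return w


rng = random.Random(7)
first_block = [rng.getrandbits(32) for _ in range(16)]
full = expand_two_way(first_block, 0)
# Re-derive the whole schedule from a window in the middle.
rebuilt = expand_two_way(full[20:36], 20)
```

Starting from the middle window `full[20:36]` reproduces the same 64 words as the usual forward expansion from $W_0, \ldots, W_{15}$, which is what lets the attacker choose which 16 words to control.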

4.8 Message Compensation 

For some choices of neutral words, the two chunks are not able to achieve the optimal length. By forcing some of the other message words to cancel the change introduced by the neutral words, the optimal or near-optimal length can be achieved.

Combining the initial structure, two-way expansion and message compensation techniques, we are able to find two chunks of length 33. We choose to control $\{W_z, \ldots, W_{z+15}\}$, for some $z$ which we will determine later. We choose $W_{z+5}$ and $W_{z+8}$ as neutral words. We show the first chunk $\{W_{z-10}, \ldots, W_{z+4}, W_{z+8}\}$ to be independent of $W_{z+5}$ and the second chunk $\{W_{z+5}, W_{z+6}, W_{z+7}, W_{z+9}, \ldots, W_{z+22}\}$ to be independent of $W_{z+8}$. Note that $W_{z+8}$ is "moved" to the first chunk by the method explained for the initial structure. For the forward direction, we need to show that $\{W_{z-10}, \ldots, W_{z-1}\}$ are independent of $W_{z+5}$ when they are expanded from $\{W_z, \ldots, W_{z+15}\}$.


$W_{z-1} = W_{z+15} - \sigma_1(W_{z+13}) - W_{z+8} - \sigma_0(W_z)$, (11)

$W_{z-2} = W_{z+14} - \sigma_1(W_{z+12}) - W_{z+7} - \sigma_0(W_{z-1})$, (12)

$W_{z-3} = W_{z+13} - \sigma_1(W_{z+11}) - W_{z+6} - \sigma_0(W_{z-2})$, (13)

$W_{z-4} = W_{z+12} - \sigma_1(W_{z+10}) - W_{z+5} - \sigma_0(W_{z-3})$, (14)

$W_{z-5} = W_{z+11} - \sigma_1(W_{z+9}) - W_{z+4} - \sigma_0(W_{z-4})$, (15)

$W_{z-6} = W_{z+10} - \sigma_1(W_{z+8}) - W_{z+3} - \sigma_0(W_{z-5})$, (16)

$W_{z-7} = W_{z+9} - \sigma_1(W_{z+7}) - W_{z+2} - \sigma_0(W_{z-6})$, (17)

$W_{z-8} = W_{z+8} - \sigma_1(W_{z+6}) - W_{z+1} - \sigma_0(W_{z-7})$, (18)

$W_{z-9} = W_{z+7} - \sigma_1(W_{z+5}) - W_z - \sigma_0(W_{z-8})$, (19)

$W_{z-10} = W_{z+6} - \sigma_1(W_{z+4}) - W_{z-1} - \sigma_0(W_{z-9})$. (20)


We note that $W_{z+5}$ is used in (19) and (14); we compensate for it by using $W_{z+7}$ and $W_{z+12}$. By "compensating" we mean making the equation value independent of $W_{z+5}$ by forcing $W_{z+7} - \sigma_1(W_{z+5}) = C$ ($C$ is some constant; we use 0 for simplicity) and $W_{z+12} - W_{z+5} = C$. $W_{z+7}$ is also used in (17); however, we can use $W_{z+9}$ to compensate for it, i.e. set $W_{z+9} = \sigma_1(W_{z+7}) = \sigma_1^2(W_{z+5})$. Then $W_{z+9}$ and $W_{z+12}$ are used in the equations above, so we continue this recursively and finally obtain the following constraints, which ensure the proper compensation of the values of $W_{z+5}$.

$W_{z+7} = \sigma_1(W_{z+5})$,

$W_{z+9} = \sigma_1^2(W_{z+5})$,

$W_{z+11} = \sigma_1^3(W_{z+5})$,

$W_{z+13} = \sigma_1^4(W_{z+5})$, (21)

$W_{z+15} = \sigma_1^5(W_{z+5})$,

$W_{z+12} = W_{z+5}$,

$W_{z+14} = 2\sigma_1(W_{z+5})$.
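The effect of these compensation constraints can be verified numerically. The Python sketch below assumes the constraints read $W_{z+7} = \sigma_1(W_{z+5})$, $W_{z+9} = \sigma_1^2(W_{z+5})$, $W_{z+11} = \sigma_1^3(W_{z+5})$, $W_{z+13} = \sigma_1^4(W_{z+5})$, $W_{z+15} = \sigma_1^5(W_{z+5})$, $W_{z+12} = W_{z+5}$ and $W_{z+14} = 2\sigma_1(W_{z+5})$ (all constants $C$ set to 0), and uses the SHA-256 $\sigma$ functions. The helper names are ours. It expands ten words backwards from the controlled window and checks that they do not change when the neutral word changes.

```python
import random

MASK32 = 0xFFFFFFFF


def rotr32(x, r):
    return ((x >> r) | (x << (32 - r))) & MASK32


def sigma0(x):
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3)


def sigma1(x):
    return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10)


def sigma1_pow(x, n):
    for _ in range(n):
        x = sigma1(x)
    return x


def compensated_window(free_words, w5):
    """Build W_z..W_{z+15}: offsets 5, 7, 9, 11, 12, 13, 14, 15 follow the neutral word."""
    w = list(free_words)
    w[5] = w5
    w[12] = w5
    w[14] = (2 * sigma1(w5)) & MASK32
    for power, offset in enumerate((7, 9, 11, 13, 15), start=1):
        w[offset] = sigma1_pow(w5, power)
    return w


def expand_backward(window, count=10):
    """Return [W_{z-count}, ..., W_{z-1}] computed from W_z..W_{z+15}."""
    ext = [0] * count + list(window)   # ext[count + k] holds W_{z+k}
    for i in range(count - 1, -1, -1):
        ext[i] = (ext[i + 16] - sigma1(ext[i + 14]) - ext[i + 9] - sigma0(ext[i + 1])) & MASK32
    return ext[:count]


rng = random.Random(21)
free = [rng.getrandbits(32) for _ in range(16)]
chunk_a = expand_backward(compensated_window(free, 0x01234567))
chunk_b = expand_backward(compensated_window(free, 0x89ABCDEF))
```

`chunk_a` and `chunk_b` are identical, so $W_{z-10}, \ldots, W_{z-1}$ are indeed independent of the neutral word $W_{z+5}$ once the compensating words follow it.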

The second chunk is independent of $W_{z+8}$ automatically, without any compensation. The 33-step two-chunk is valid regardless of the choice of $z$ as long as $z > 10$. To simplify the notation, we use $W_j, \ldots, W_{j+32}$ to denote the two chunks; then $W_{j+15}$ and $W_{j+18}$ are the two neutral words. We reserve the final choice of $j$ for later, in order to pick the one that allows attacking the most steps.

5 Preimage Attack against 43 Steps SHA-256

5.1 Number of Attacked Steps 

The attack on SHA-256 uses the 33-step two-chunk $W_j, \ldots, W_{j+32}$ explained in Section 4. Hence, in the forward direction, $p_{j+33}$ can be computed independently of the other chunk, and in the backward direction, $p_j$ can be computed independently of the other chunk. We extend the number of attacked steps as much as possible with the partial-fixing (PF) and indirect-partial-matching (IPM) techniques.

Forward computation of $A_{j+34}$: The equation for $A_{j+34}$ is as follows.

$A_{j+34} = \Sigma_0(A_{j+33}) + \mathrm{Maj}(A_{j+33}, B_{j+33}, C_{j+33}) + H_{j+33} + \Sigma_1(E_{j+33}) + \mathrm{Ch}(E_{j+33}, F_{j+33}, G_{j+33}) + K_{j+33} + W_{j+33}$,
$W_{j+33} = \sigma_1(W_{j+31}) + W_{j+26} + \sigma_0(W_{j+18}) + W_{j+17}$.

We can use either PF or IPM to compute $A_{j+34}$. If we use PF, we fix the lower $l$ bits of $W_{j+18}$, which is a neutral word for the other chunk. According to Table 2, this fixes the lower $l - 18$ bits of $\sigma_0(W_{j+18})$. Finally, the lower $l - 18$ bits of $A_{j+34}$ can be computed. If we use IPM, we describe $A_{j+34}$ as a sum of functions of each neutral word, i.e. $A_{j+34} = \psi_F(W_{j+15}) + \xi_F(W_{j+18})$. From the above equations, this can be done easily. Note that IPM is more efficient than PF with respect to only computing $A_{j+34}$, because IPM does not need to fix a part of the neutral word.

Forward computation of $A_{j+35}$: The equation for $A_{j+35}$ is as follows.

$A_{j+35} = \Sigma_0(A_{j+34}) + \mathrm{Maj}(A_{j+34}, B_{j+34}, C_{j+34}) + \cdots + W_{j+34}$,
$W_{j+34} = \sigma_1(W_{j+32}) + W_{j+27} + \sigma_0(W_{j+19}) + W_{j+18}$.

Neither PF nor IPM can compute $A_{j+35}$. If we used PF for $A_{j+34}$, only the lower $l - 18$ bits are known. This makes all bits of $A_{j+35}$ unknown after the computation of $\Sigma_0(A_{j+34})$. If we used IPM, $A_{j+34}$ is described as a sum of two independent functions. However, because $\Sigma_0$ consists of the XOR of three self-rotations, it seems difficult to describe $\Sigma_0(A_{j+34})$ as a sum of two independent functions.

In summary, we can skip only 1 step in the forward direction. In this case, using IPM is more efficient than using PF.

Backward computation of $H_{j-1}$: The equation for $H_{j-1}$ is as follows.

$H_{j-1} = A_j - (\Sigma_0(B_j) + \mathrm{Maj}(B_j, C_j, D_j)) - \Sigma_1(F_j) - \mathrm{Ch}(F_j, G_j, H_j) - K_{j-1} - W_{j-1}$,
$W_{j-1} = W_{j+15} - \sigma_1(W_{j+13}) - W_{j+8} - \sigma_0(W_j)$.

We can use either PF or IPM to compute $H_{j-1}$. If we use PF, we fix the lower $l$ bits of $W_{j+15}$, and then the lower $l$ bits of $H_{j-1}$ can be computed. If we use IPM, we describe $H_{j-1}$ as a sum of functions of each neutral word.


Fig. 3. Separation of chunks and dependencies of state words for SHA-256 


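Backward computation of $H_{j-2}$: The equation for $H_{j-2}$ is as follows.

$H_{j-2} = A_{j-1} - (\Sigma_0(B_{j-1}) + \mathrm{Maj}(B_{j-1}, C_{j-1}, D_{j-1})) - \Sigma_1(F_{j-1}) - \mathrm{Ch}(F_{j-1}, G_{j-1}, H_{j-1}) - K_{j-2} - W_{j-2}$,
$W_{j-2} = W_{j+14} - \sigma_1(W_{j+12}) - W_{j+7} - \sigma_0(W_{j-1})$.

We can use PF to compute $H_{j-2}$ but cannot use IPM. Describing $\mathrm{Ch}(F_{j-1}, G_{j-1}, H_{j-1})$ and $\sigma_0(W_{j-1})$ as a sum of two independent functions seems difficult. If we used PF for $H_{j-1}$, we can obtain the lower $l$ bits of $\mathrm{Ch}(F_{j-1}, G_{j-1}, H_{j-1})$ and the lower $l - 18$ bits of $\sigma_0(W_{j-1})$. Finally, we can compute the lower $l - 18$ bits of $H_{j-2}$.

By a similar analysis, we confirmed that we cannot compute $H_{j-3}$. In summary, we can skip 2 steps in the backward direction with PF, which fixes the lower $l$ ($l > 18$) bits of $W_{j+15}$.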

The attack uses the 33-step two-chunk $W_j, \ldots, W_{j+32}$ including the 4-step initial structure. Apply PF for $W_{j-1}$ and $W_{j-2}$, and apply IPM for $W_{j+34}$. Finally, 43 steps are attacked by skipping an additional 7 steps using the partial-matching technique.

36 steps ($W_{j-2}$ to $W_{j+34}$) must be located sequentially. We have several options for $j$. We choose $j = 3$ for the following two purposes: (1) $W_{13}$, $W_{14}$, and $W_{15}$ can be freely chosen to satisfy the message padding rules; (2) a pseudo-preimage attack on SHA-224 is possible (explained in Section 7.2).

We need to fix the lower $l + 18$ bits of $W_{18}$ to fix the lower $l$ bits of $W_2$ by PF. Besides, we lose half of the remaining freedom to construct the 4-step initial structure. Hence, we choose $l$ to balance $l - 18$ and $(32 - l)/2$, i.e. we choose $l = 23$.

The overview of the separation of chunks is shown in Fig. 3. The marks in the figure distinguish the following cases: variables depending only on $W_{21}$; variables depending only on $W_{18}$; registers that can be expressed as a sum modulo $2^{32}$ of two independent functions of the neutral variables $W_{18}$ and $W_{21}$; registers with few bits depending only on $W_{21}$; and registers depending on both $W_{18}$ and $W_{21}$ in a complicated way.

5.2 Attack Procedure

1. Randomly choose the values for the internal chaining variable $p_{18}$ (after the movement of message words by the initial structure) and the message word $W_{19}$. Randomly fix the lower 23 bits of $W_{18}$. By using the remaining 9 free bits of $W_{18}$, find $2^5$ values on average that correctly construct the 4-step initial structure, and store them in the table $T_W$. Let us call this an initial table-preparation.

2. Randomly choose the message words not related to the initial structure and the neutral words, i.e. $W_{13}$, $W_{14}$, $W_{15}$, $W_{16}$, $W_{17}$, $W_{23}$. Let us call this an initial configuration.

3. For all $2^5$ possible $W_{18}$ in $T_W$, compute the corresponding $W_{20}$, $W_{22}$, $W_{24}$, $W_{25}$, $W_{26}$, $W_{27}$, $W_{28}$ as shown in equations (21). Compute forward and find $\psi_F(W_{18})$. Store the pairs $(W_{18}, \psi_F(W_{18}))$ in a list $L_F$.

4. For all $2^4$ possible values (the lower 4 bits) of $W_{21}$, compute backward and find $\xi_F(W_{21})$, which is $\sigma_0(W_{21})$ in this attack, and the lower 4 bits of $A_{37}$.

5. Compare the lower 4 bits of $A_{37} - \sigma_0(W_{21})$ and the lower 4 bits of $\psi_F(W_{18})$ stored in $L_F$.

6. If a match is found, compute $A_{37}, B_{37}, \ldots, H_{37}$ with the corresponding $W_{18}$ and $W_{21}$ and check whether the results from both directions match each other. If they do, output $p_0$ and $W_0, \ldots, W_{15}$ as a pseudo-preimage.

7. Repeat steps 2-6 for all possible choices of $W_{13}$, $W_{16}$, $W_{17}$, $W_{23}$. Note that the MSB of $W_{13}$ is fixed to 1 to satisfy message padding. Hence, we have $2^{127}$ freedom for this step.

8. If no freedom remains in step 7, repeat steps 1-7.

9. Repeat steps 1-8 $2^4$ times to obtain $2^4$ pseudo-preimages. Then, convert them to a preimage according to [Fact 9.99].

5.3 Complexity Estimation 

We assume the complexity of 1 step function and 1-step message expansion is $\frac{1}{43}$ of a compression function operation of 43-step SHA-256. We also assume that the cost of memory access is negligible compared to the computation time for the step function and message expansion. The complexity of step 1 is $2^9$, and it uses a memory of $2^5$ words. The complexity of step 2 is negligible. In step 3 we compute $p_{j+1} \leftarrow R_j(p_j, W_j)$ for $j = 18, 19, \ldots, 36$ and the corresponding message expansion. Hence, the complexity is $2^5 \cdot \frac{19}{43}$. We use a memory of $2^5 \times 2$ words. Similarly, in step 4 we compute $p_j \leftarrow R_j^{-1}(p_{j+1}, W_j)$ for $j = 20, 19, \ldots, 2$ and 6 more steps for partial-fixing and partial-matching. Hence, the complexity is $2^4 \cdot \frac{25}{43}$. In step 5 we compare the lower 4 bits of $2^9$ $(= 2^4 \cdot 2^5)$ items. Hence, $2^5$ results will remain. The complexity of step 6 is $2^5 \cdot \frac{8}{43}$ and the probability that all other bits match is $2^{-252}$. Hence, the number of remaining pairs becomes $2^{-247}$ $(= 2^5 \cdot 2^{-252})$. So far, the complexity from step 3 to step 6 is $2^5 \cdot \frac{19}{43} + 2^4 \cdot \frac{25}{43} + 2^5 \cdot \frac{8}{43} = 2^5 \cdot \frac{39.5}{43} \approx 2^{4.878}$. In step 7 this is repeated $2^{127}$ times and its complexity is $2^{131.878}$. Step 8 is computed $2^{120}$ times. This takes $2^{120} \cdot (2^9 + 2^{131.878}) \approx 2^{251.9}$. This is the complexity of the pseudo-preimage attack on 43-step SHA-256. Finally, in step 9, preimages are found with a complexity of $2^{1+(251.878+256)/2} = 2^{254.939} \approx 2^{254.9}$. The required memory for finding a pseudo-preimage is $2^5$ words and $2^5 \times 2$ words in steps 1 and 3, which is $2^5 \times 3$ words. For finding a preimage, we need to store $2^{1.9}$ pseudo-preimages for the unbalanced meet-in-the-middle. This requires a memory of $2^{1.9} \times 24$ words.
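The totals in this estimate can be re-derived with a few lines of Python. The per-step fractions used below (19/43 for the forward chunk, 25/43 for the backward chunk and 8/43 for the final check) are our reading of the text: the first two follow from the stated step counts, and the third is an assumption chosen because it reproduces the stated subtotal of $2^{4.878}$.

```python
from math import log2

STEPS = 43  # one step costs 1/43 of a 43-step compression function call

# Work for one initial configuration (procedure steps 3, 4 and 6).
forward_chunk = 2 ** 5 * 19 / STEPS     # step 3: 2^5 values, 19 steps each
backward_chunk = 2 ** 4 * 25 / STEPS    # step 4: 2^4 values, 19 + 6 steps each
final_check = 2 ** 5 * 8 / STEPS        # step 6: assumed 8 steps per candidate
per_configuration_log2 = log2(forward_chunk + backward_chunk + final_check)

# Step 7 repeats this 2^127 times; step 8 repeats steps 1-7 2^120 times.
step7_log2 = 127 + per_configuration_log2
pseudo_preimage_log2 = 120 + log2(2 ** 9 + 2 ** step7_log2)

# Step 9: pseudo-preimage to preimage conversion, 2^(1 + (y + 256) / 2).
preimage_log2 = 1 + (pseudo_preimage_log2 + 256) / 2
```

The three exponents come out at roughly 4.878, 251.878 and 254.939, in line with the figures quoted in the text; the $2^9$ table-preparation cost is negligible next to $2^{131.878}$.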


5.4 Attack on 42 Steps SHA-256 

When we attack 42 steps, we use 1-step IPM instead of 2-step PF in the backward direction. This allows the attacker to use more message freedom. We choose $l = 10$ so that $l$ and $(32 - l)/2$ are balanced. Because each chunk has at least 10 free bits, the complexity for finding pseudo-preimages is approximately $2^{246}$ $(= 2^{256} \cdot 2^{-10})$. The precise evaluation is listed in Table 1.

6 Preimage Attack against 46 Steps SHA-512 

6.1 Basic Strategy for SHA-512 

For SHA-512, we can attack more steps than for SHA-256 by using PF. This is due to the following two properties:

- The message-word size of SHA-512 is bigger than that of SHA-256. Hence, the bit-mixing speed of the $\sigma$ and $\Sigma$ functions is slower than in SHA-256.

- The choice of the three rotation numbers for the $\sigma_0$ function is very biased.

In view of the above, we decided to use the message freedom available to the attacker for applying PF as much as possible.

Construction of the 4-step initial structure explained in Section 4.6 consumes a lot of message freedom. Therefore, we do not use the 4-step initial structure for SHA-512. Construction of the 3-step initial structure also needs a lot of message freedom. On the other hand, the 2-step initial structure does not consume any message freedom, because we do not have to control the Ch functions. Finally, in our attack, we use a 31-step two-chunk including the 2-step initial structure. Because construction of the 2-step initial structure is much simpler than that of the 4-step initial structure, we omit the detailed explanation of the construction.


6.2 Chunk Separation 

The 31 message words we use are $W_j$ to $W_{j+30}$. We apply the 2-step initial structure for $W_{j+15}$ and $W_{j+16}$; hence the neutral word for the first chunk is $W_{j+16}$ and for the second chunk is $W_{j+15}$. Whenever we change the value of $W_{j+16}$, we change the values of $W_{j+7}, W_{j+6}, \ldots, W_j$ by the message compensation technique so that the change does not affect the second chunk. Similarly, whenever we change $W_{j+15}$, we change $W_{j+17}, W_{j+19}, W_{j+21}, W_{j+22}, \ldots, W_{j+30}$. Finally, $W_j$ to $W_{j+30}$ form the 31-step two-chunk.

6.3 Partial-Fixing Technique

We skip 6 steps in the backward direction and 2 steps in the forward direction by PF. Namely, we need to partially compute $W_{j-1}, W_{j-2}, \ldots, W_{j-6}$ independently of $W_{j+15}$, and partially compute $W_{j+31}$ and $W_{j+32}$ independently of $W_{j+16}$. The equations for these message words are as follows.

$W_{j-1} = W_{j+15} - \sigma_1(W_{j+13}) - W_{j+8} - \sigma_0(W_j)$,
$W_{j-2} = W_{j+14} - \sigma_1(W_{j+12}) - W_{j+7} - \sigma_0(W_{j-1})$,
$W_{j-3} = W_{j+13} - \sigma_1(W_{j+11}) - W_{j+6} - \sigma_0(W_{j-2})$,
$W_{j-4} = W_{j+12} - \sigma_1(W_{j+10}) - W_{j+5} - \sigma_0(W_{j-3})$,
$W_{j-5} = W_{j+11} - \sigma_1(W_{j+9}) - W_{j+4} - \sigma_0(W_{j-4})$,
$W_{j-6} = W_{j+10} - \sigma_1(W_{j+8}) - W_{j+3} - \sigma_0(W_{j-5})$,

$W_{j+31} = \sigma_1(W_{j+29}) + W_{j+24} + \sigma_0(W_{j+16}) + W_{j+15}$,
$W_{j+32} = \sigma_1(W_{j+30}) + W_{j+25} + \sigma_0(W_{j+17}) + W_{j+16}$.

Recall Table 2. If the lower $l$ bits of the input of $\sigma_0$ are fixed, we can compute the lower $l - 8$ bits of its output. In the backward direction, if we fix the lower $l$ bits of $W_{j+15}$, the lower $l$ bits of $W_{j-1}$, the lower $l - 8$ bits of $W_{j-2}$, the lower $l - 16$ bits of $W_{j-3}$, the lower $l - 24$ bits of $W_{j-4}$, the lower $l - 32$ bits of $W_{j-5}$, and the lower $l - 40$ bits of $W_{j-6}$ can become independent of the second chunk. This results in computing the lower $l$ bits of $H_{j-1}$, the lower $l - 8$ bits of $H_{j-2}$, the lower $l - 16$ bits of $H_{j-3}$, the lower $l - 41$ bits of $H_{j-4}$, the lower $l - 49$ bits of $H_{j-5}$, and the lower $l - 57$ bits of $H_{j-6}$. Note that we also need to consider $\Sigma_1$ to compute $H_{j-4}$, $H_{j-5}$, and $H_{j-6}$. If we fix the lower $l$ bits of $W_{j+16}$, the lower $l - 8$ bits of $W_{j+31}$ and the lower $l$ bits of $W_{j+32}$ can become independent of the first chunk. This results in computing the lower $l - 8$ bits of $A_{j+32}$ and the lower $l - 47$ bits of $A_{j+33}$.

Therefore, if we choose $l = 60$, we can match the lower 3 bits of $H_{j-6}$ and 13 bits of $A_{j+33}$ after we skip 7 steps by the partial-matching technique.
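The bit counts above follow from a simple bookkeeping rule that can be written down in Python. This is a simplified model of our own, not the authors' procedure: it tracks only how many low bits of each word are known, subtracts the Table 2 losses for SHA-512 ($\sigma_0$: 8, $\Sigma_0$: 39, $\Sigma_1$: 41), and assumes that the $\Sigma_1$ input for $H_{j-k}$ is $F_{j-k+1} = H_{j-k+3}$, which is fully known for $k \le 3$.

```python
WORD_BITS = 64
LOSS_SMALL_SIGMA0 = 8    # sigma_0 of SHA-512 (Table 2)
LOSS_BIG_SIGMA0 = 39     # Sigma_0 of SHA-512 (Table 2)
LOSS_BIG_SIGMA1 = 41     # Sigma_1 of SHA-512 (Table 2)


def through(known, loss):
    """Known low bits after a sigma/Sigma function (full words stay full)."""
    if known >= WORD_BITS:
        return WORD_BITS
    return max(0, known - loss)


def backward_known_bits(l, steps=6):
    """Known low bits of H_{j-1}, ..., H_{j-steps} when l bits are fixed."""
    # W_{j-1} has l known bits; each further word passes through sigma_0.
    w = [l]
    for _ in range(steps - 1):
        w.append(through(w[-1], LOSS_SMALL_SIGMA0))
    h = []
    for k in range(1, steps + 1):
        # Sigma_1 input F_{j-k+1} equals H_{j-k+3}: fully known for k <= 3.
        f_known = WORD_BITS if k <= 3 else h[k - 4]
        h.append(min(w[k - 1], through(f_known, LOSS_BIG_SIGMA1)))
    return h


def forward_known_bits(l):
    """Known low bits of A_{j+32} and A_{j+33} when l bits are fixed."""
    a32 = through(l, LOSS_SMALL_SIGMA0)            # limited by W_{j+31}
    a33 = min(l, through(a32, LOSS_BIG_SIGMA0))    # limited by Sigma_0(A_{j+32})
    return a32, a33


backward = backward_known_bits(60)
forward = forward_known_bits(60)
```

For $l = 60$ the model gives 60, 52, 44, 19, 11 and 3 known bits for $H_{j-1}, \ldots, H_{j-6}$ and 52 and 13 known bits for $A_{j+32}$ and $A_{j+33}$, matching the 3-bit and 13-bit figures in the text.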


6.4 Attack Overview 

The attack uses the 31-step two-chunk $W_j, \ldots, W_{j+30}$ including the 2-step initial structure. Apply PF for $W_{j-1}, W_{j-2}, \ldots, W_{j-6}$, and for $W_{j+31}$, $W_{j+32}$. Finally, 46 steps are attacked by skipping an additional 7 steps using the partial-matching technique.

39 steps ($W_{j-6}$ to $W_{j+32}$) must be located sequentially. Because $W_{j+8}$, $W_{j+9}$, $W_{j+10}$, $W_{j+11}$, $W_{j+12}$, $W_{j+13}$, $W_{j+14}$, $W_{j+18}$, $W_{j+20}$ are the message words we fix in advance, we choose $j = 6$ so that $W_{14}$ and $W_{15}$ can be chosen to satisfy the message padding rules. The MSB of $W_{13}$ can also be satisfied. In this chunk separation, $W_{j+7}$ can be described as $W_{j+7} = \mathrm{Const} - W_{j+16}$, where Const is a chosen fixed value and the lower $l$ bits of $W_{j+16}$ are fixed. If we fix Const and the MSB of $W_{j+16}$ to 0 and some value, respectively, and choose the lower $l$ bits of $W_{j+16}$ so that the MSB of $-W_{j+16}$ does not change for all active bits of $W_{j+16}$, we can always fix the MSB of $W_{j+7}$.

The number of free bits in $W_{j+16}$ is 3. ($l = 60$, but we fix the MSB to satisfy padding for $W_{13}$.) The number of free bits in $W_{j+15}$ is 4. Results from both chunks are compared on 3 bits. Therefore, the final complexity of the pseudo-preimage attack is approximately $2^{509}$. This is converted to a preimage attack whose complexity is approximately $2^{511.5}$. For finding pseudo-preimages, this attack needs to store $2^3$ items. Hence, the required memory is $2^3 \times 9$ words. For finding preimages, we need to store $2^{1.5}$ pseudo-preimages for the unbalanced meet-in-the-middle. This requires a memory of $2^{1.5} \times 24$ words.

6.5 Attack on 42 Steps SHA-512 

When we attack 42 steps, we drop 1 step of PF in the forward direction and 3 steps of PF in the backward direction. We choose $l = 40$. Because each chunk has at least 24 free bits, the complexity for finding pseudo-preimages is approximately $2^{488}$ $(= 2^{512} \cdot 2^{-24})$. The precise evaluation is listed in Table 1.

7 Remarks 

7.1 Length of Preimages 

The preimages consist of at least two blocks: the last block is used to find pseudo-preimages, and the second-to-last block links to the input chaining value of the last block. Two-block preimages are only possible if we can preset the message words used for encoding the length ($m_{14}$ and $m_{15}$ for SHA-2) of the last block according to the padding and length encoding rules. In our case, this can be done in the first step of the algorithm. Alternatively, we can leave $m_{14}$ and $m_{15}$ random and later still resolve the length using expandable messages.


7.2 SHA-224 and SHA-384 

Our attack on 43-step SHA-256 can also produce pseudo-preimages for SHA-224 by using the approach of Sasaki [23]. In our attack, we match 4 bits of $A_{37}$, which is essentially equivalent to $G_{43}$. Then, we repeat the attack until the other registers randomly match, i.e. we wait until $A_{43}, B_{43}, \ldots, F_{43}$, and $H_{43}$ randomly match. In SHA-224, the value of $H_{43}$ is discarded in the output. Hence, we do not have to care about the match of $H_{43}$, which decreases the complexity by a factor of $2^{32}$. Hence, pseudo-preimages of SHA-224 can be computed with a complexity of $2^{219.9}$ $(= 2^{251.9} \cdot 2^{-32})$. Note that this cannot be converted to a preimage attack on SHA-224 because the size of the intermediate chaining variable is 256 bits.

If we apply our attack on SHA-512 to SHA-384, $W_{13}$, $W_{14}$, and $W_{15}$ will depend on neutral words. Hence, we cannot confirm whether 46-step SHA-384 can be attacked, because of the padding problem. However, 43-step SHA-384 can be attacked by using the same chunks as for SHA-256. By considering the difference in word size and the application of PF, we can optimize the complexity by choosing $l = 27$ so that $l - 8$ and $(64 - l)/2$ are balanced.

7.3 Multi-preimages and Second-Preimages

We note that the method of converting pseudo-preimages to preimages can be further extended to find multi-preimages. We first find $k$-block multi-collisions, then follow the expandable message to link to the final block. This gives $2^k$ multi-preimages with an additional $k 2^{n/2}$ computations, which is negligible when $k$ is much smaller than $2^{(n-t)/2}$ ($t$ denotes the number of bits for each chunk; refer to Section 3). We need an additional $128k$ bytes of memory to store the $k$-block multi-collisions. Furthermore, since most of the message words are randomly chosen, this attack naturally gives second preimages with high probability. The above multi-preimages are most probably multi-second preimages.

8 Conclusions 

In this paper, we presented preimage attacks on 43-step SHA-256 and 46-step SHA-512. The time complexity of the attack on 43-step SHA-256 is $2^{254.9}$ and it requires $2^5 \cdot 3$ words of memory. The time complexity of the attack on 46-step SHA-512 is $2^{511.5}$ and it requires $2^3 \cdot 9$ words of memory. The number of attacked steps is greatly improved over the best previous attack; in other words, the security margin of SHA-256 and SHA-512 is greatly reduced. Because SHA-256 and SHA-512 have 64 and 80 steps, respectively, they are currently secure.

An open question worth investigating is whether the current attacks can still be improved. Perhaps finding a 15 + 4 + 15 pattern of chunks with a 4-step initial structure in the middle, or using a better partial-fixing technique that would utilize the middle bits of the message word, would extend the attacks.

The preimage attack we presented creates a very interesting situation for SHA-2, in which a preimage attack, covering 43 or 46 steps, is much better than the best known collision attack, with only 24 steps. Our attack does not convert to a collision attack because its complexity is above the birthday bound. However, we believe that the existence of such a preimage attack suggests that a collision attack of similar length could also be possible. In that light, the problem of finding collisions for reduced variants of SHA-256 definitely deserves more attention.

Abstract. We demonstrate how the framework that is used for creating
efficient number-theoretic ID and signature schemes can be transferred 
into the setting of lattices. This results in constructions of the most ef- 
ficient to-date identification and signature schemes with security based 
on the worst-case hardness of problems in ideal lattices. In particular, 
our ID scheme has communication complexity of around 65, 000 bits and 
the length of the signatures produced by our signature scheme is about 
50, 000 bits. All prior lattice-based identification schemes required on the 
order of millions of bits to be transferred, while all previous lattice-based 
signature schemes were either stateful, too inefficient, or produced sig- 
natures whose lengths were also on the order of millions of bits. The 
security of our identification scheme is based on the hardness of finding 
the approximate shortest vector to within a factor of 0(n 2 ) in the stan- 
dard model, while the security of the signature scheme is based on the 
same assumption in the random oracle model. Our protocols are very 
efficient, with all operations requiring 0(n) time. 

We also show that the technique for constructing our lattice-based 
schemes can be used to improve certain number-theoretic schemes. In 
particular, we are able to shorten the length of the signatures that are 
produced by Girault’s factoring-based digital signature scheme i |1 ( II I 1 ll-fll ) . 


1 Introduction 

The appeal of building cryptographic primitives based on the hardness of lattice 
problems began with the seminal work of Ajtai who showed that one-way func- 
tions could be built with security based on the worst-case hardness of certain 
lattice problems 0 ■ Unfortunately, cryptographic primitives that were built with 
this very strong security property were extremely inefficient for practical
applications. For example, evaluating one-way and collision-resistant hash functions
required $\tilde{O}(n^2)$ time and space [2,24], and in public-key cryptosystems, the
keys were on the order of megabytes j3E3EEH| (also see [25] for concrete parameter
proposals for the scheme in El). Therefore, some new ideas were required in
order to make provably secure lattice-based primitives a realistic alternative to
ones based on number theory.

A promising approach for improving efficiency is to use lattices that possess
extra algebraic structure, and it is precisely this extra structure that makes the
NTRU cryptosystem [Hj (which unfortunately does not have a proof of security)
very efficient in practice. A step in the direction of building provably secure
lattice-based primitives was taken by Micciancio m, who showed that one could
build efficient ($\tilde{O}(n)$ evaluation time) one-way functions with security based
on the worst-case instances of problems pertaining to cyclic lattices (cyclic lattices
are lattices that correspond to ideals in the ring $\mathbb{Z}[x]/\langle x^n - 1\rangle$).
This result was later extended to give constructions of collision-resistant hash
functions by either restricting the domain [2H! or changing the ring m in
Micciancio’s scheme. These works then led to constructions and implementations of
collision-resistant hash functions m with security based on worst-case problems in
lattices corresponding to ideals in $\mathbb{Z}[x]/\langle x^n + 1\rangle$ whose
performance was comparable to that of the ad-hoc hash functions currently in use.
Because there is a very close connection between collision-resistant hash functions
and more sophisticated primitives such as ID schemes and digital signatures, it
was very natural to ask whether these primitives also had efficient lattice-based
constructions. There has been some recent work in this direction, which we will
now describe.

Lyubashevsky and Micciancio constructed a one-time signature in which signing
and verification can be performed in time $\tilde{O}(n)$ [19]. Using standard
techniques, the one-time signature can be transformed into a full-fledged signature
scheme using a signature tree [21 12 2 j with only an additional work factor of
$O(\log n)$. While this combination results in a theoretically appealing scheme
where all the operations take time $\tilde{O}(n)$, it does require the use of a tree,
which is a somewhat unwanted feature in practice. Another signature scheme was
proposed by Gentry et al. in jOj. Their signature scheme follows the hash-and-sign
paradigm, and when instantiated with algebraic lattices E3, verification takes
time $\tilde{O}(n)$, but $\tilde{O}(n^4)$ time is needed to do the signing (it is
plausible that the signing time could be reduced to $\tilde{O}(n^2)$ with a more
careful analysis).

A different way of constructing digital signature schemes is to first construct
an identification scheme of a certain form and then convert it to a signature
scheme using the Fiat-Shamir transform [71.421 1 1 . The identification schemes of
Micciancio and Vadhan |2E], Lyubashevsky [17], and Kawachi et al. na can
all be instantiated such that the secret and public keys are of size $\tilde{O}(n)$,
and the entire interaction takes $\tilde{O}(n)$ time as well. While these constructions
seem essentially optimal, they contain a common inefficiency. The ID schemes all
have the form of standard commit-challenge-response protocols (see Figure 1
for an example of one where $Y$ is the commitment, $c$ is the challenge, and $z$ is
the response), and the inefficiency lies in the fact that for each challenge bit,
the response consists of $\tilde{O}(n)$ bits. Since the security of the protocol is
directly connected to the number of challenge bits sent by the verifier, it means that
for every bit of security, $\tilde{O}(n)$ bits need to be transmitted. Theoretically,
this does not cause a problem because one only needs $\omega(\log n)$ bits of security
in order for the protocol to be considered secure against polynomial-time adversaries,
and then the total running time of the above protocols is still $\tilde{O}(n)$. But in
practice, this is a rather unsatisfactory solution because one wants some concrete
security guarantee, say 80 bits, and then the communication complexity of the ID
scheme will be about 80 times larger (the size of the signature in the derived
signature scheme would be 160 times larger) than possibly necessary. This is in sharp
contrast to number-theoretic ID schemes where the response of the prover is
longer than the challenge by only a small factor.

What allows number-theoretic ID schemes like Schnorr pa.GQ na, Girault 
ms, Okamoto M, etc. to be so “compact” is that the challenge string in these 
protocols is not treated as a sequence of independent 0’s and 1’s, but instead
the entire string is interpreted as an integer from a certain domain. This can be 
done because there is a lot of underlying algebraic structure upon which these 
schemes are built. On the other hand, lattices do not seem to have as much 
algebraic underpinning, and so the schemes based on them are very combinatorial 
in nature which is why the challenge strings are treated simply as a sequence of 
independent challenges much like in generic zero-knowledge proofs for NP. The 
main accomplishment of the current work is to show how to exploit the limited 
algebraic structure of ideal lattices in order to use the challenge bits collectively 
rather than individually, which ends up greatly improving the practical efficiency 
of lattice-based identification and signature schemes. 

1.1 Contributions and Comparisons 

Lattice-based constructions. We construct a lattice-based ID scheme in 
which the challenge string is treated as a polynomial in a certain ring, and one 
correct response to it from the prover is enough for authentication. The caveat is 
that some constant fraction of the time, the prover cannot respond to the chal- 
lenge from the verifier and must abort the protocol. The result of this is that the 
“commit” and “challenge” steps of the ID scheme now must be repeated several 
times to ensure that a valid prover is accepted with some decent probability. 
But using standard techniques, one can significantly shorten the length of the 
“commit” part of the protocol, and because of the structure of our scheme, the 
challenge can always be the same. Therefore the number of transmitted bits is 
dominated by the length of the single “response” . 

Even more optimizations are possible when converting the ID scheme into
a signature scheme using the Fiat-Shamir transform. In the resulting signature
scheme, there is of course no longer any interaction until the signer outputs the
signature, and therefore there is no need for the signer to output the attempts
in which he failed to sign (which correspond to the times he couldn’t answer the
challenge in the ID scheme). So while the failures do cost time, the length of
the final signature is as short as it would have been if the signer only attempted
to sign once and succeeded. And because the probability of failure is a small
constant ($\approx 2/3$), we only expect to repeat the signature protocol 3 times
before succeeding.
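
The expected number of repetitions can be checked with a short calculation. This is only a sketch under an idealizing assumption that is not stated in the text: each signing attempt fails independently with the same probability $\delta \approx 2/3$. The symbols $\delta$ and $T$ (the number of attempts up to and including the first success) are introduced here for illustration.

```latex
\[
\Pr[T = t] = \delta^{\,t-1}(1-\delta), \qquad
\mathbb{E}[T] = \sum_{t \ge 1} t\,\delta^{\,t-1}(1-\delta)
             = \frac{1}{1-\delta} \approx \frac{1}{1/3} = 3 .
\]
```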

All operations in our scheme take time $\tilde{O}(n)$ and we prove that the ID and
signature schemes are secure based on the worst-case hardness of approximating
the shortest vector to within a factor of $\tilde{O}(n^2)$ in lattices corresponding
to ideals in the ring $\mathbb{Z}[x]/\langle x^n + 1\rangle$ (the security of the
signature scheme is in the random oracle model). Compared to previous works, our
asymptotic hardness assumption is the same as that in [19,17] (although the scheme
of [H3 is secure in the standard model), but is worse than that in [20,9,37] (where
the factor is $\tilde{O}(n^{1.5})$) and in [15] (where the factor is $\tilde{O}(n)$).

Based on the work of Gama and Nguyen jHJ who worked out the effectiveness 
of current state-of-the-art lattice reduction algorithms, we present some con- 
crete parameters with which our schemes can be instantiated. On the low end, 
the outputted signatures are about 50000 bits in length (the ID scheme requires 
about 65000 bits to be transmitted). While the scheme of [03 has better asymp- 
totic security, the response to each challenge bit seems to require at least 10000 
bits. So if we would like the challenge to be 160 bits for security purposes, the 
response (and therefore the signature size) will be over a million bits. The signa- 
ture schemes of m and m look like they would have their signatures be about 
160 times longer than ours (the ID schemes would require communications that 
are about 80 times longer), again because the responses are done separately to 
every challenge bit. So even though our ID and signature schemes have worse 
asymptotic security, their structure makes them much more practically efficient. 

At this point it is not possible to give an accurate comparison of our signature
scheme to the hash-and-sign signature schemes [9,37] because no concrete
parameters were given for those schemes. But independent of the signature sizes,
our scheme will still have the advantage in that signing can be done in time
$\tilde{O}(n)$ rather than $\tilde{O}(n^4)$.

The signature length of the one-time signature in [HI may actually be a 
little shorter than in our scheme, but this advantage is lost when the one-time 
signature gets converted to a general stateless signature scheme. If a signature 
tree is used in the conversion, then the signature length may go up by a factor 
of the tree depth, which would make it much less efficient. On the other hand, 
one could build a hash tree using any collision-resistant hash function, and then 
the signatures would only increase by the product of the tree depth and the 
hash function output. If the scheme is to be completely stateless and support
about $2^{60}$ signatures, and we use SHA-256 as the hash function, then the length
of the one-time signature in m would increase by about 15,000 bits, which
would make it somewhat longer than our signature. The similarity between the
signature sizes of our scheme and the scheme in m is no coincidence, and we
further discuss the relationship between the two schemes in Section 1.2.

Factoring-based signatures. We show that the ideas used to construct our 
lattice-based digital signature can also be used for shortening the length of some 
number-theoretic schemes. The signature scheme originally proposed by Girault
m, and analyzed in men is a scheme whose security, in the random oracle
model, is based on the hardness of factorization. What is particularly attrac- 
tive about it is that if the signer can do some pre-computing before receiving 
the message, then signing can be done with just one random oracle query, one 
multiplication, and one addition over the integers (no modular reduction is re- 
quired). We show how to reduce the length of the signature in an instantiation 
of the scheme due to Pointcheval jS3] from 488 bits to 425. 

1.2 Techniques 

There is a pattern that emerges when looking at constructions of certain ID 
and signature schemes based on the hardness of factoring and discrete log. The 
informal chain of reductions from the hard problem to the signature scheme 
looks as follows: 

Hard Problem $\le$ CRHF $\le$ One-time signature $\le$ ID scheme $\le$ Signature

For example, finding collisions in the hash function $h(x) = g^x \bmod N$ implies
being able to factor $N$. This can be converted into a one-time signature with the
secret key being some pair of integers $(x, y)$, the public key being $h(x), h(y)$, and
the signature of a message $c$ being $xc + y$. The one-time signature can then be
converted into an ID scheme by simply picking a new $y$ every time (Figure 1),
with $c$ now being a challenge chosen by the verifier. The ID scheme can then
be converted to a signature scheme by using the Fiat-Shamir transform, which
replaces the verifier with a random oracle (Figure EJ) pnillldlj . The same idea
can be used with the hash function $h(x_1, x_2) = g_1^{x_1} g_2^{x_2} \bmod p$, in which
finding collisions implies solving the discrete log problem. The ID and signature
schemes resulting from that hash function are due to Okamoto [23].
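
The factoring-based one-time signature just described can be sketched in a few lines of Python. This is a toy illustration only: the modulus `N`, the base `g`, the key range and the function names are assumptions made for the example (far too small to be secure), not parameters taken from the paper.

```python
import secrets

# Toy parameters (assumptions for illustration only; far too small to be secure).
N = 3233   # 61 * 53, standing in for an RSA-type modulus
g = 5      # arbitrary base coprime to N

def h(x: int) -> int:
    """Hash h(x) = g^x mod N; it satisfies h(a + b) = h(a) * h(b) mod N."""
    return pow(g, x, N)

def keygen(bound: int = 1 << 16):
    """Secret key (x, y); public key (h(x), h(y))."""
    x, y = secrets.randbelow(bound), secrets.randbelow(bound)
    return (x, y), (h(x), h(y))

def sign(sk, c: int) -> int:
    """One-time signature of the message c, computed over the integers."""
    x, y = sk
    return x * c + y

def verify(pk, c: int, z: int) -> bool:
    """Accept iff g^z = h(x)^c * h(y) (mod N)."""
    X, Y = pk
    return h(z) == (pow(X, c, N) * Y) % N

sk, pk = keygen()
c = 1234
z = sign(sk, c)
print(verify(pk, c, z))      # True
print(verify(pk, c, z + 1))  # False, since g is not 1 modulo N
```

Verification works only because `h` is a homomorphism from addition to multiplication, which is the property the rest of the section builds on.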

It turns out that a somewhat similar approach can be used to build lattice- 
based primitives as well. The works of f‘2DI I ?Sj , showed a reduction from the worst- 
case problem of finding short vectors in algebraic lattices to finding collisions in 
hash functions. The work of [HJ can be viewed as a transformation of the hash 
function to a one-time signature, and this current work can then be seen as the 
continuation of this chain of reductions where the one-time signature of HO] is 
converted into an ID scheme and then into a signature scheme. 

But what prevents the techniques used in number-theoretic schemes from being
directly extended to lattice-based ones is that lattices allow for much less
algebraic structure. For example, the domains in number-theoretic hash functions are
rings, while in lattice-based ones, the domain is just a subset of a ring (in
particular, those elements in the ring that have small Euclidean norm) that is closed
under neither addition nor multiplication. This is closely related to the fact that
the factoring and discrete log problems can be reduced to finding an element in
the kernel of some homomorphic function, while finding short vectors in lattices
reduces to the problem of finding small elements in the kernel of a homomorphism.
This difference is what seems to give lattice problems resistance against
polynomial-time quantum algorithms that solve factoring and discrete log |3fl1 ,
but at the same time it also hinders constructions of lattice-based primitives.

Secret key: $s \leftarrow D_s$
Public key: $N$, $g$, and $S \leftarrow g^s \bmod N$

Prover:   $y \leftarrow D_y$, $Y \leftarrow g^y \bmod N$
Verifier: $c \leftarrow D_c$
Verifier: Accept iff $g^z = Y S^c \pmod{N}$

Fig. 1. Factoring-Based Identification Schemes. The parameters for this scheme
are in Figure El. The line in [ ] is only performed in the aborting version of the scheme.


In overcoming this limitation, the one-time signature of EES! had to leak parts 
of its secret key. While it wasn’t a problem in that setting because the secret key 
is only used once, in ID schemes the same secret key is used over and over, and 
so leaking a part of the secret key every time would result in complete insecurity. 
In this paper, we solve this difficulty by using an aborting technique that was 
introduced in m The idea behind aborting is that the prover can elect to abort 
the protocol in order to protect some information about his secret key (mainly, 
the protocol needs to remain witness-indistinguishable). In this work, we are 
able to relax the conditions that were needed for witness-indistinguishability in 
ini. and this allows us to construct much more efficient lattice-based protocols 
as well as extend the technique to other contexts, such as the factoring-based 
scheme described in Section II .11 We essentially show that all that is needed for 
the aborting technique to be applicable is a collision-resistant homomorphic hash 
function that has small elements in its kernel. We believe that this technique can 
find further applications. 


1.3 Intuition for Aborting 

Understanding where aborting might be useful is best accomplished with an
example. Consider the protocol in Figure 1 (for this discussion, it is not necessary
to understand why the protocol works), which has the form of a typical 3-round
commit-challenge-response ID scheme. The secret key is some $s$ and the public
key is $h(s)$, where $h$ is a function that happens to be $h(s) = g^s \bmod N$ in our
example. In the first step of the protocol, the prover picks a parameter $y$ and
sends $h(y)$ to the verifier. The verifier picks a random “challenge” $c$ and sends
it to the prover. The third step of the protocol consists of a response of the
prover to the challenge. This response must somehow use the secret key, and in
our example, the secret key $s$ is multiplied by $c$ and then added to $y$. Notice that
sending $sc$ without adding it to $y$ would completely reveal $s$, and so the job of
$y$ is to somehow mask the exact value of $sc$. If the operation $sc$ takes place in
some finite group, then a natural idea for masking would be to pick $y$ uniformly
at random from that group. The intuition is that if nothing about $y$ is known,
then the value $y + sc$ is also completely random (of course, something is known
about $y$ when the prover sends $h(y)$ to the verifier, but we gloss over that here).
And this is exactly what is done in well-known ID schemes such as Schnorr,
GQ mg, Okamoto [23], etc.

But sometimes it is infeasible to pick $y$ uniformly at random from the group.
In Girault’s ID scheme f 1 1)11 I (Figure 1), the multiplication $sc$ is performed
over the integers, which is an infinite group. A way to do masking in this scheme
is to pick a $y$ in a range that is much larger than the range of $sc$. So for example,
if $0 \le sc \le A$, then one could pick a random $y$ from the range
$[0, \ldots, 2^{64}A]$. Then, with very high probability, the value of $sc + y$ will be
in $[A, \ldots, 2^{64}A]$, in which case it will be impossible to determine anything
about $sc$ if nothing is known about $y$.

In constructing our lattice-based ID scheme, the same difficulty is encountered
as in Girault’s scheme, except we do not have the luxury of picking $y$
(or something analogous to $y$ in the lattice-based scheme) from such a large
range, because doing so would require us to make a much stronger complexity
assumption, which would significantly decrease the efficiency of the protocol (we
would have to assume that it is hard to find a super-polynomial approximation
of the shortest vector instead of just an $\tilde{O}(n^2)$ approximation). Our solution
is to instead pick $y$ from a much smaller set, something analogous to
$[0, \ldots, 2A]$, but only reveal $sc + y$ if it falls into the range $[A, \ldots, 2A]$.
If the range is picked carefully and the function $h$ is a homomorphism that has
“small” elements in its kernel, then one can show that if the prover only reveals
values in this range and aborts otherwise, the protocol will be perfectly
witness-indistinguishable. The witness-indistinguishability is then used to prove
security of the protocol by showing that a forger can be used for extracting
collisions in $h$.
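
The effect of revealing $sc + y$ only when it lands in $[A, \ldots, 2A]$ can be checked by exhaustive enumeration in the integer setting of the example. The sketch below assumes a toy bound `A = 10` and a secret-dependent term with $0 \le sc \le A$; the function name is hypothetical. It confirms that the revealed values have the same distribution for every value of `sc`.

```python
from collections import Counter

def revealed_distribution(sc: int, A: int) -> Counter:
    """Enumerate y in [0, 2A] and count the values sc + y that land in [A, 2A].

    Values outside that range correspond to aborts and are never revealed.
    """
    counts = Counter()
    for y in range(0, 2 * A + 1):
        z = sc + y
        if A <= z <= 2 * A:
            counts[z] += 1
    return counts

A = 10  # toy bound (assumption); the secret-dependent term satisfies 0 <= sc <= A
reference = revealed_distribution(0, A)
for sc in range(0, A + 1):
    # The revealed values have the same distribution no matter what sc is.
    assert revealed_distribution(sc, A) == reference

accept_prob = sum(reference.values()) / (2 * A + 1)
print(accept_prob)  # (A + 1) / (2A + 1), roughly 1/2
```

Every value in $[A, \ldots, 2A]$ is hit by exactly one choice of $y$, so the only cost of the small range is the abort probability, not any leakage about `sc`.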

The same technique can also be applied to Girault’s scheme. Notice that
if we pick $y$ uniformly at random from the range $[0, \ldots, 2A]$ instead of from
$[0, \ldots, 2^{64}A]$, the length of $sc + y$ will be 63 bits shorter. We point out
that our aborting factoring-based ID scheme in Figure 1, which uses this idea, is
actually worse than the corresponding non-aborting one because the savings gained by
shortening $sc + y$ are lost in case the prover has to abort and the ID protocol
has to be repeated. But the advantage of aborting does show itself when the ID
protocol is converted into a signature scheme using the Fiat-Shamir paradigm
(Figure E|). In a signature scheme, there is no interaction, and therefore there
is no need for the signer to ever include the aborted signing attempts in the
final signature. So if the signer needs to abort, he simply reruns the protocol
until he gets a signature in the correct range. The end result is that the eventual
signature is shorter than it would have been in schemes such as f 1 1)11 I I.Tl] where
the signer does not have the option to abort.

2 Preliminaries 

2.1 Notation 

We will denote vectors by bold letters. For convenience, vectors of vectors will
be denoted by a bold letter with a hat. For example, if $\mathbf{a}_1, \mathbf{a}_2$ are
elements of $\mathbb{Z}^n$, then we can write $\hat{\mathbf{a}} = (\mathbf{a}_1, \mathbf{a}_2)$.
The $\ell_\infty$ norm of $\mathbf{a}$ is written as $\|\mathbf{a}\|_\infty$, and
$\|\hat{\mathbf{a}}\|_\infty$ for $\hat{\mathbf{a}} = (\mathbf{a}_1, \ldots, \mathbf{a}_m)$
is defined as $\max_i(\|\mathbf{a}_i\|_\infty)$. If $S$ is a set, then $a \leftarrow S$
means that $a$ is chosen uniformly at random from $S$. All logarithms are assumed
to be base 2.


2.2 Lattices and Algebra 

An integer lattice $\Lambda$ is a subgroup of $\mathbb{Z}^n$. The approximate Shortest
Vector Problem ($\mathrm{SVP}_\gamma(\Lambda)$) asks to find a nonzero vector $\mathbf{v}$
in $\Lambda$ such that $\|\mathbf{v}\|_\infty$ is no more than $\gamma$ times larger than
the nonzero vector in $\Lambda$ with the smallest $\ell_\infty$ norm. In this work, we
will be interested in lattices that exhibit an additional algebraic property: in
particular, they correspond to ideals in the ring $\mathbb{Z}[x]/\langle x^n + 1\rangle$.
We will say that a lattice $\Lambda$ is an $(x^n + 1)$-cyclic lattice if for every
vector $(v_0, \ldots, v_{n-2}, v_{n-1}) \in \Lambda$, the vector
$(-v_{n-1}, v_0, \ldots, v_{n-2})$ is also in $\Lambda$. If we look at the vectors as
polynomials (i.e. $(v_0, \ldots, v_{n-2}, v_{n-1})$ as
$v_0 + \ldots + v_{n-2}x^{n-2} + v_{n-1}x^{n-1}$), then an $(x^n + 1)$-cyclic lattice
is an ideal in $\mathbb{Z}[x]/\langle x^n + 1\rangle$ because in this ring,

$$(v_0 + \ldots + v_{n-2}x^{n-2} + v_{n-1}x^{n-1}) \cdot x = -v_{n-1} + v_0 x + \ldots + v_{n-2}x^{n-1}.$$
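
The identity above says that multiplying by $x$ rotates the coefficient vector and negates the coefficient that wraps around. A minimal Python sketch (the function name `times_x` and the sample vector are illustrative assumptions):

```python
def times_x(v: list[int]) -> list[int]:
    """Multiply v_0 + v_1 x + ... + v_{n-1} x^{n-1} by x modulo x^n + 1.

    Since x^n = -1 in this ring, the top coefficient wraps around with a sign flip.
    """
    return [-v[-1]] + v[:-1]

v = [1, 2, 3, 4]          # 1 + 2x + 3x^2 + 4x^3, with n = 4
print(times_x(v))         # [-4, 1, 2, 3]

# Applying the map n times multiplies by x^n = -1.
w = v
for _ in range(len(v)):
    w = times_x(w)
print(w)                  # [-1, -2, -3, -4]
```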

The ring that will be most important to us throughout the paper is the ring
$\mathbb{Z}_p[x]/\langle x^n + 1\rangle$, where $p$ is some odd positive integer. The
elements in $\mathbb{Z}_p[x]/\langle x^n + 1\rangle$ will be represented by polynomials
of degree $n - 1$ having coefficients in the range $[-(p-1)/2, (p-1)/2]$. Throughout
the paper, we will treat polynomials in $\mathbb{Z}_p[x]/\langle x^n + 1\rangle$ and
vectors in $\mathbb{Z}^n$ as the same data type. So when, for example,
we talk of multiplying two vectors, we actually mean converting the vectors to
polynomials and then multiplying the polynomials in
$\mathbb{Z}_p[x]/\langle x^n + 1\rangle$. Similarly, the norm$^1$ of a polynomial is
just the norm of the corresponding vector. It’s not hard to see that for polynomials
$\mathbf{v}, \mathbf{w} \in \mathbb{Z}_p[x]/\langle x^n + 1\rangle$, the following
relation holds:

$(x^n + 1)$-cyclic lattices are a particular class of lattices that received attention
because one can construct efficient and provably secure cryptographic primitives
based on the hardness of finding approximate short vectors in these lattices
1 1 S I 2 11 1 1 11120 1 . The main reason for this efficiency is that the multiplication
of two polynomials in $\mathbb{Z}_p[x]/\langle x^n + 1\rangle$ can be done in time
$\tilde{O}(n)$ using the Fast Fourier Transform. While the results in this paper can be
applied to lattices that correspond to ideals in other rings, it would only
unnecessarily complicate matters because the ring
$\mathbb{Z}[x]/\langle x^n + 1\rangle$ seems to be the most useful theoretically
and in practice.

While a lot is known about the complexity of SVP in general lattices, very
little is known about this problem when restricted to ideal lattices. Nevertheless,
the problem is related to some problems in algebraic number theory (see jl8l.‘>()l )
that do not have any efficient solution. And it seems that the currently best
lattice algorithms are unable to take advantage of the extra structure provided
by ideal lattices. Therefore, it still seems that solving $\mathrm{SVP}_\gamma$ takes
time $2^{\Omega(n)}$ when $\gamma = n^{O(1)}$ j 1 6l4j .

$^1$ This is a slight abuse of the word norm. Because of the reduction modulo $p$, it’s
not true that for any integer $\alpha$ we have
$\|\alpha\mathbf{a}\|_\infty = |\alpha| \|\mathbf{a}\|_\infty$, but it still holds true that


2.3 Lattice-Based Collision-Resistant Hash Function 

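Let $R$ be the ring $\mathbb{Z}_p[x]/\langle x^n + 1\rangle$. We define the following
family of hash functions:

Definition 1. For any integer $m$ and $D \subseteq R$, the function family
$\mathcal{H}(R, D, m)$ mapping $D^m$ to $R$ is defined as

$$\mathcal{H}(R, D, m) = \{h_{\hat{\mathbf{a}}} : \hat{\mathbf{a}} \in R^m\}, \text{ where for any } \hat{\mathbf{z}} \in D^m,\ h_{\hat{\mathbf{a}}}(\hat{\mathbf{z}}) = \hat{\mathbf{a}} \cdot \hat{\mathbf{z}}.$$

That is, if $\hat{\mathbf{a}} = (\mathbf{a}_1, \ldots, \mathbf{a}_m)$ and
$\hat{\mathbf{z}} = (\mathbf{z}_1, \ldots, \mathbf{z}_m)$, then
$h_{\hat{\mathbf{a}}}(\hat{\mathbf{z}}) = \mathbf{a}_1\mathbf{z}_1 + \ldots + \mathbf{a}_m\mathbf{z}_m$,
where all the operations are performed in the ring
$\mathbb{Z}_p[x]/\langle x^n + 1\rangle$. It’s not hard to see that the hash functions
in $\mathcal{H}(R, D, m)$ satisfy the following two properties for any
$\hat{\mathbf{y}}, \hat{\mathbf{z}} \in R^m$ and $\mathbf{c} \in R$:

$$h(\hat{\mathbf{y}} + \hat{\mathbf{z}}) = h(\hat{\mathbf{y}}) + h(\hat{\mathbf{z}}) \qquad (1)$$

$$h(\hat{\mathbf{y}}\mathbf{c}) = h(\hat{\mathbf{y}})\mathbf{c} \qquad (2)$$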
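
Properties (1) and (2) can be checked numerically on small instances. The following is a sketch with toy sizes (`n = 8`, `p = 257`, `m = 3`, all assumptions for illustration) and plain quadratic-time polynomial multiplication in place of the Fast Fourier Transform; the helper names are hypothetical.

```python
import random

n, p, m = 8, 257, 3   # toy sizes (assumptions); p is odd, n is a power of 2

def center(a: int) -> int:
    """Reduce a modulo p into the range [-(p-1)/2, (p-1)/2]."""
    a %= p
    return a - p if a > (p - 1) // 2 else a

def add(u, v):
    return [center(a + b) for a, b in zip(u, v)]

def mul(u, v):
    """Schoolbook product in Z_p[x]/<x^n + 1> (quadratic time, for clarity only)."""
    out = [0] * n
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            k = i + j
            if k < n:
                out[k] += a * b
            else:                      # x^n = -1
                out[k - n] -= a * b
    return [center(a) for a in out]

def h(a_hat, z_hat):
    """h_a(z) = a_1 z_1 + ... + a_m z_m, computed in the ring."""
    acc = [0] * n
    for a, z in zip(a_hat, z_hat):
        acc = add(acc, mul(a, z))
    return acc

rng = random.Random(0)

def rand_poly():
    return [rng.randint(-(p - 1) // 2, (p - 1) // 2) for _ in range(n)]

a_hat = [rand_poly() for _ in range(m)]
y_hat = [rand_poly() for _ in range(m)]
z_hat = [rand_poly() for _ in range(m)]
c = rand_poly()

# Property (1): h(y + z) = h(y) + h(z)
lhs1 = h(a_hat, [add(y, z) for y, z in zip(y_hat, z_hat)])
print(lhs1 == add(h(a_hat, y_hat), h(a_hat, z_hat)))   # True
# Property (2): h(y c) = h(y) c
lhs2 = h(a_hat, [mul(y, c) for y in y_hat])
print(lhs2 == mul(h(a_hat, y_hat), c))                  # True
```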


The collision problem $\mathrm{Col}(h, D)$ is defined as follows:

Definition 2. Given an element $h \in \mathcal{H}(R, D, m)$, the collision problem
$\mathrm{Col}(h, D)$, where $D \subseteq R$, asks to find two distinct elements
$\hat{\mathbf{z}}, \hat{\mathbf{z}}' \in D^m$ such that
$h(\hat{\mathbf{z}}) = h(\hat{\mathbf{z}}')$.

In HH|, it was shown that when $D$ is some restricted domain, solving the
$\mathrm{Col}(h, D)$ problem for random $h \in \mathcal{H}(R, D, m)$ is as hard as
solving $\mathrm{SVP}_\gamma$ for any $(x^n + 1)$-cyclic lattice.

Theorem 1. Let $R = \mathbb{Z}_p[x]/\langle x^n + 1\rangle$ be a ring where $n$ is any
power of 2, and define $D = \{\mathbf{y} \in R : \|\mathbf{y}\|_\infty \le d\}$ for some
integer $d$. Let $\mathcal{H}(R, D, m)$ be a hash function family as in Definition 1
such that $m > \frac{\log p}{\log 2d}$ and $p \ge 4dmn^{1.5}\log n$.

If there is a polynomial-time algorithm that solves the $\mathrm{Col}(h, D)$ problem for
a random $h \in \mathcal{H}(R, D, m)$ with some non-negligible probability, then there is
a polynomial-time algorithm that can solve $\mathrm{SVP}_\gamma(\Lambda)$ for every
$(x^n + 1)$-cyclic lattice $\Lambda$, where $\gamma = 16dmn\log^2 n$.


2.4 Cryptographic Definitions 

Digital Signatures. We recall the definitions of signature schemes and what
it means for a signature scheme to be secure.

Definition 3. A signature scheme consists of a triplet of polynomial-time (possibly
probabilistic) algorithms $(G, S, V)$ such that for every pair of outputs $(s, v)$
of $G(1^n)$ and any $n$-bit message $m$,

$$\Pr[V(v, m, S(s, m)) = 1] = 1$$

where the probability is taken over the randomness of algorithms $S$ and $V$.

In the above definition, G is called the key-generation algorithm, S is the signing 
algorithm, V is the verification algorithm, and s and v are, respectively, the 
signing and verification keys. 

A signature scheme is said to be secure if there is only a negligible probability 
that any forger, after seeing signatures of messages of his choosing, can sign a 
message whose signature he has not already seen 02 - 

Definition 4. A signature scheme $(G, S, V)$ is said to be secure if for every
polynomial-time (possibly randomized) forger $\mathcal{F}$, the probability that after
seeing the public key and $\{(\mu_1, S(s, \mu_1)), \ldots, (\mu_q, S(s, \mu_q))\}$ for
any $q$ messages $\mu_i$ of its choosing (where $q$ is polynomial in $n$), $\mathcal{F}$
can produce $(\mu \neq \mu_i, \sigma)$ such that $V(v, \mu, \sigma) = 1$, is negligibly
small. The probability is taken over the randomness of $G$, $S$, $V$, and $\mathcal{F}$.

In the standard security definition of a signature scheme, the forger should not be 
able to produce a signature of a new message. A stronger notion of security, called 
strong unforgeability requires that in addition to the above, a forger shouldn’t 
even be able to come up with a different signature for a message whose signature 
he has already seen. The schemes presented in this paper satisfy this stronger 
notion of unforgeability. 

Identification Schemes. An identification scheme consists of a key-generation 
algorithm and a description of an interactive protocol between a prover, pos- 
sessing the secret key, and verifier possessing the corresponding public key. In 
general, it is required that the verifier accepts the interaction with a prover who 
behaves honestly with probability one, but this definition can be relaxed so that 
sometimes an honest prover is not accepted with some small probability. 

The standard active attack model against identification schemes proceeds in 
two phases jS|. In the first phase, the adversary interacts with the prover in an 
effort to obtain some information. In the second stage, the adversary plays the 
role of the prover and tries to make a verifier accept the interaction. We remark 
that in the second stage, the adversary no longer has access to the honest prover. 
The adversary succeeds if he is able to make an honest verifier accept with some 
non-negligible probability. 

Witness-Indistinguishability. We will only define the concept of
witness-indistinguishability in a way that pertains to our application and we refer the
reader to [OJ for the more general definition. For convenience, we will use the
notation from the identification protocol in Figure 1. An identification scheme
is said to be perfectly witness-indistinguishable if for any public key $S$, and any
two valid secret keys $s, s'$ (i.e. $s, s' \in D_s$ and
$g^s \bmod N = g^{s'} \bmod N = S$), the view of any (possibly malicious) verifier in
the interaction where the prover uses $s$ has the exact same distribution as the view
where the prover uses $s'$. In other words, it is impossible for the verifier to
figure out which of the valid secret keys the prover is using to authenticate himself.

3 Lattice-Based Constructions 

In this section, we present our lattice-based identification (Figure EJ) and
signature (Figure EJ) schemes. In Figure 2 we define all the parameters that will
appear in this section as well as give some concrete instantiations. The parameter
$\kappa$ controls the size of the domain from which the challenges/signatures
come. In order to have soundness error of at most $2^{-80}$, this parameter must be
set such that the size of this domain is $2^{160}$. The parameter $p$ is chosen such
that every public key has a very high probability of having multiple corresponding
secret keys associated with it. The free parameters $n$, $m$, and $\sigma$ need to be
set in a way so that it is computationally infeasible to find collisions in the
underlying hash function family $\mathcal{H}(R, D, m)$.

The last two lines of the table in Figure 2 deal with the practical cryptanalysis
of our signature scheme. The last line of the table specifies the length of the
shortest vector in a certain lattice defined by our signature scheme that can be
found in practice, while the line above that specifies the length of the vector that
needs to be found in order to forge a signature. See Section EH for more details.


Parameter                                     Definition                                                        Sample Instantiations
n                                             integer that is a power of 2                                      512           512           512           1024
m                                             any integer                                                       4             5             8             8
$\sigma$                                      any integer                                                       127           2047          2047          2047
$\kappa$                                      integer s.t. $2^\kappa \binom{n}{\kappa} \ge 2^{160}$             24            24            24            21
$p$                                           integer $\approx (2\sigma+1)^m \cdot 2^{-128/n}$                  $2^{31.7}$    $2^{59.8}$    $2^{95.8}$    $2^{95.9}$
$R$                                           ring $\mathbb{Z}_p[x]/\langle x^n + 1\rangle$
$D$                                           $\{g \in R : \|g\|_\infty \le mn\sigma\kappa\}$
$D_s$                                         $\{g \in R : \|g\|_\infty \le \sigma\}$
$D_c$                                         $\{g \in R : \|g\|_1 \le \kappa\}$
$D_y$                                         $\{g \in R : \|g\|_\infty \le mn\sigma\kappa\}$
$G$                                           $\{g \in R : \|g\|_\infty \le mn\sigma\kappa - \sigma\kappa\}$
Signature Size                                $\approx mn\log(2mn\sigma\kappa)$ bits                            49000         72000         119000        246000
Public Key Size                               $\approx n\log p$ bits                                            16000         31000         49000         98000
Secret Key Size                               $\approx mn\log(2\sigma+1)$ bits                                  16000         31000         49000         98000
Hash Function Size                            $\approx mn\log p$ bits                                           65000         153000        392000        786000
Length of vector needed to break signature                                                                      2"°           2^.9          2"' D
Length of shortest vector that can be found                                                                     2^-°          2 M -‘        T ,.°         2 DM “


Fig. 2. Lattice-Based Schemes’ Parameter Definitions and Sample Instantiations 
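
The derived rows of Figure 2 can be recomputed from (n, m, σ, κ). The following is a minimal sketch, assuming the definitions read as reconstructed in the table (in particular p ≈ (2σ + 1)^m · 2^{-128/n} and a challenge domain of size about 2^κ · C(n, κ)); the function and key names are illustrative and not taken from the paper.

```python
from math import comb, log2

# (n, m, sigma, kappa) for the four sample instantiations of Fig. 2
INSTANCES = [(512, 4, 127, 24), (512, 5, 2047, 24),
             (512, 8, 2047, 24), (1024, 8, 2047, 21)]


def derived(n, m, sigma, kappa):
    """Recompute the derived quantities of Fig. 2 (all values in bits / log2)."""
    log_p = m * log2(2 * sigma + 1) - 128 / n
    return {
        # log2 of the assumed lower bound 2^kappa * C(n, kappa) on |D_c|
        "challenge_bits": kappa + log2(comb(n, kappa)),
        "log2_p": log_p,
        "signature_bits": m * n * log2(2 * m * n * sigma * kappa),
        "public_key_bits": n * log_p,
        "secret_key_bits": m * n * log2(2 * sigma + 1),
        "hash_bits": m * n * log_p,
    }


if __name__ == "__main__":
    for params in INSTANCES:
        row = derived(*params)
        print(params, {k: round(v, 1) for k, v in row.items()})
```

The printed sizes agree with the table only up to its rounding to the nearest thousand bits.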




3.1 Identification Scheme 

The secret key of the prover, denoted s, consists of a set of m polynomials 
from the set D_s which are picked uniformly at random. The public key of the 
prover consists of a hash function h which is picked randomly from the family 
H(R, D, m), and the polynomial S = h(s). We point out that it is not necessary 
for every prover to have a distinct h. If trusted randomness is available, then 
everyone can share the same random h, which considerably lowers the public 
key size because the hash function h can be hard-coded into the signing and 
verification algorithms. 

In the first step of the protocol, the prover picks a random y ∈ D_y^m, and 
“commits” to it by sending Y = h(y) to the verifier. The verifier then picks 
a random challenge c from D_c and sends it to the prover. The prover then 
computes z = sc + y. If this result falls into the range G^m, the prover sends it 
to the verifier. Otherwise, he aborts the protocol. Upon receiving z, the verifier 
accepts the interaction if z ∈ G^m and h(z) = Sc + Y. Using the homomorphic 
properties of h (additivity and compatibility with multiplication by ring 
elements), we see that h(sc + y) = Sc + Y, and so an 
honest prover who does not abort will always be accepted. 
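
One round of the protocol just described can be sketched as follows. This is a toy illustration only: the sizes, the modulus P, and all function names are assumptions chosen for readability, the challenge sampler is not uniform over D_c, and nothing here is secure.

```python
import secrets

# Toy sizes (assumptions for illustration; far too small for any security).
N_DEG, M, SIGMA, KAPPA = 8, 3, 3, 2
P = 2**31 - 1                          # toy modulus, not a value from Fig. 2
BOUND_Y = M * N_DEG * SIGMA * KAPPA    # D_y: ||y||_inf <= mn*sigma*kappa
BOUND_G = BOUND_Y - SIGMA * KAPPA      # G:   ||z||_inf <= mn*sigma*kappa - sigma*kappa


def poly_mul(a, b):
    """Product in Z[x]/(x^n + 1) (negacyclic convolution), no reduction mod P."""
    out = [0] * N_DEG
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N_DEG:
                out[i + j] += ai * bj
            else:
                out[i + j - N_DEG] -= ai * bj   # x^n = -1
    return out


def h(key, vec):
    """h(vec) = sum_i a_i * vec_i in Z_P[x]/(x^n + 1)."""
    acc = [0] * N_DEG
    for a, v in zip(key, vec):
        acc = [(x + y) % P for x, y in zip(acc, poly_mul(a, v))]
    return acc


def rand_poly(bound):
    """Polynomial with coefficients uniform in [-bound, bound]."""
    return [secrets.randbelow(2 * bound + 1) - bound for _ in range(N_DEG)]


def rand_challenge():
    """Polynomial with l1 norm <= kappa (simple sampler, NOT uniform on D_c)."""
    c = [0] * N_DEG
    for _ in range(KAPPA):
        c[secrets.randbelow(N_DEG)] += secrets.choice([-1, 1])
    return c


def keygen():
    key = [[secrets.randbelow(P) for _ in range(N_DEG)] for _ in range(M)]
    s = [rand_poly(SIGMA) for _ in range(M)]
    return key, s, h(key, s)


def commit(key):
    y = [rand_poly(BOUND_Y) for _ in range(M)]
    return y, h(key, y)


def respond(s, y, c):
    """Return z = sc + y, or None (abort) if z falls outside G^m."""
    z = [[sc + yc for sc, yc in zip(poly_mul(si, c), yi)] for si, yi in zip(s, y)]
    if any(abs(coef) > BOUND_G for poly in z for coef in poly):
        return None
    return z


def verify(key, S, Y, c, z):
    """Accept iff z is in G^m and h(z) = Sc + Y."""
    if z is None or any(abs(coef) > BOUND_G for poly in z for coef in poly):
        return False
    rhs = [(a + b) % P for a, b in zip(poly_mul(S, c), Y)]
    return h(key, z) == rhs
```

With these toy sizes each round aborts fairly often, so a caller would typically loop over `commit` and `respond` until `respond` returns a value.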

Proving the soundness and completeness of the protocol is done using the 
following series of steps: 

1. Show that an honest prover is accepted with probability 1/e. 

2. Show that the ID scheme is perfectly witness-indistinguishable. 

3. Show that with probability 1 − 2^{-128}, for a randomly-picked s ∈ D_s^m, there 
is another s' ∈ D_s^m such that h(s) = h(s'). 

4. Show how to extract a collision in h from an adversary who succeeds in 
breaking the protocol. 

Step 1 shows that the completeness of the protocol is 1/e. We will explain 
how to increase this number later. Step 2 is essentially the main part of the 
proof, which shows that for every pair of possible secret keys s, s' such that 
S = h(s) = h(s'), no adversarial verifier can determine which secret key is being 
used by the prover. The reason for this is that we have set up the parameters 
so that for every secret key s ∈ D_s^m, every challenge c ∈ D_c, and every response 
z ∈ G^m, the value of z − sc is in D_y^m. This implies that having seen the history 
(Y, c, z), it is impossible to tell whether the secret key was s and we picked 
a masking parameter y, or the secret key was s' and we picked the masking 
parameter y' = z − s'c = y + sc − s'c = y + (s − s')c, because h(s) = h(s') = S 
and h(y) = h(y') = Y. 
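
The claim that z − sc always lands in D_y^m can be checked coordinate-wise. The following derivation is a sketch under the parameter definitions of Figure 2 as reconstructed here (||s_i||_∞ ≤ σ, ||c||_1 ≤ κ, ||z_i||_∞ ≤ mnσκ − σκ); it uses the fact that in Z[x]/(x^n + 1) every coefficient of a product s_i c is a signed sum in which each coefficient of c appears once.

```latex
\[
\|z_i - s_i c\|_\infty
  \;\le\; \|z_i\|_\infty + \|s_i c\|_\infty
  \;\le\; (mn\sigma\kappa - \sigma\kappa) + \|s_i\|_\infty \,\|c\|_1
  \;\le\; (mn\sigma\kappa - \sigma\kappa) + \sigma\kappa
  \;=\; mn\sigma\kappa .
\]
```

The same bound holds with s' in place of s, which is why both (s, y) and (s', y') are consistent with any accepted transcript.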

To make the claim in step 2 non-vacuous, we need to show that for a randomly 
picked secret key, there is indeed a high probability that another secret key exists 
which produces the same public key. This is done in step 3. 

In step 4, we show how to use a successful adversary to solve the Col(h, D) 
problem for a random h ∈ H(R, D, m). Given a random h ∈ H(R, D, m), we pick 
a random secret key s and publish the public keys h and S = h(s). In the first 
stage of the attack, the adversary plays the role of the verifier, and we are able 
to perfectly play the part of the prover because we know the secret key. In the 
second stage when the adversary attempts to impersonate the prover, we receive 
his commitment, and send a random challenge c ∈ D_c. After he responds with z, 
we rewind and pick another random challenge c' ∈ D_c, to which the adversary 
will respond with z'. The responses of the adversary and our knowledge of the 
secret key allow us to obtain the equation h(z − sc) = h(z' − sc'). By our choice 
of parameters, both z − sc and z' − sc' are in D, and because of the 
witness-indistinguishability of the protocol, the adversary cannot know our exact secret 
key. Therefore with probability at least 1/2, z − sc and z' − sc' will be distinct 
and we have a collision for h. Thus an adversary who can break the ID scheme 
can be used to solve Col(h, D) for random h ∈ H(R, D, m), and by Theorem […] 
this implies finding the approximate short vector in all (x^n + 1)-cyclic lattices. 

Private key: s ←$ D_s^m 
Public key:  h ←$ H(R, D, m), S ← h(s) 

Prover                                      Verifier 
y ←$ D_y^m, Y ← h(y)        --- Y ---> 
                            <--- c ---      c ←$ D_c 
z ← sc + y 
if z ∉ G^m then z ← ⊥       --- z ---> 
                                            Accept iff z ∈ G^m and h(z) = Sc + Y 

Fig. 3. Lattice-Based Identification Scheme 

Theorem 2. If the identification scheme in Figure 3 is insecure against active 
attacks for the parameters in the table of Figure 2, then there is a polynomial-time 
algorithm that can solve SVP_γ(Λ) for γ = O(n^2) for every lattice Λ corresponding 
to an ideal in the ring Z[x]/(x^n + 1). 

Notice that the ID scheme is not quite satisfactory because a valid prover is 
only accepted with probability 1/e. This means that the scheme may have to 
be repeated several times until the prover succeeds. Because we showed that the 
scheme is witness-indistinguishable, the repetitions can be performed in parallel, 
and the witness-indistinguishability property will still be preserved […]. So the 
straightforward way to modify the ID scheme would be, for example, to pick 30 
different y_i's and send the Y_i = h(y_i) to the verifier. Then the verifier will send 
30 challenges, and the prover replies to the first one of these challenges that he 
can. This would result in a protocol where the honest prover is accepted with 
probability about 1 − 2^{-20}. 
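
The figure of about 1 − 2^{-20} follows from the per-round acceptance probability 1/e, assuming the 30 rounds abort independently. A small sketch of that arithmetic (the function name is illustrative):

```python
from math import e, log2


def all_abort_probability(repetitions, accept=1 / e):
    """Probability that every one of `repetitions` independent rounds aborts."""
    return (1 - accept) ** repetitions


if __name__ == "__main__":
    fail = all_abort_probability(30)
    print(f"all 30 rounds abort with probability about 2^{log2(fail):.1f}")
```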

But there are some significant improvements that can be made. First of all, 
the verifier needs to send only one challenge, rather than one challenge for every 
commitment (this is because we show that for every challenge c, the probability 
of aborting is equal over the random choice of y). And secondly, we can use a 
standard trick to shorten the length of every Y_i, which will result in large savings 
in our protocol because the length of each Y_i is approximately n log p bits, which 
could be as large as 100,000 bits! Instead of sending Y, we can send H(Y) where 
H is any collision resistant hash function. Unlike with h, we will not need H to 
have any algebraic properties such as the homomorphic properties of h, so H could 
be a cryptographic hash function such as SHA or an efficient lattice-based hash 
function from […] whose output is about 512 bits. So sending 30 H(Y)’s will only 
require about 15,000 bits in total. In this modified protocol, the verifier’s 
challenge and the prover’s reply remain the same as in the old protocol. But to 
authenticate the prover, the verifier checks whether z ∈ G^m and that H(h(z) − Sc) 
is equal to some H(Y) sent by the prover in the first step^2. It can be shown that an 
adversary who breaks this protocol can be used to find a collision either in H or 
in h. We will give more details in the full version of the paper. 

Signing Key: s ←$ D_s^m 
Verification Key: h ←$ H(R, D, m), S ← h(s) 
Random Oracle: H : {0, 1}* → D_c 

Sign(μ, h, s)                          Verify(μ, z, e, h, S) 
1: y ←$ D_y^m                          1: Accept iff 
2: e ← H(h(y), μ)                         z ∈ G^m and e = H(h(z) − Se, μ) 
3: z ← se + y 
4: if z ∉ G^m, then goto step 1 
5: output (z, e) 

Fig. 4. Lattice-Based Signature Scheme 

3.2 Signature Scheme 

Our signature scheme is presented in Figure 4. The public and secret keys are just 
like in the ID scheme. To sign a message μ, we pick a random y, compute 
e = H(h(y), μ) and z = se + y, and send (z, e) as the signature only if z is in the 
set G^m. Otherwise we repeat the procedure until z ends up in G^m. The probability that 
we succeed in getting z to be in G^m on any particular try is the same as the 
probability that the ID scheme in Figure 3 doesn’t send ⊥, which is 1/e. So we 
expect to repeat the signing procedure less than 3 times to get a signature. 
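
The signing loop with aborts can be sketched along the same lines. This is a toy illustration under stated assumptions: the sizes and modulus are arbitrary, SHA-256 stands in for the random oracle H, the mapping of the digest to a challenge is a hypothetical choice that is not uniform over D_c, and none of the names come from the paper.

```python
import hashlib
import secrets

# Toy sizes (assumptions for illustration only; not secure).
N_DEG, M, SIGMA, KAPPA = 8, 3, 3, 2
P = 2**31 - 1
BOUND_Y = M * N_DEG * SIGMA * KAPPA
BOUND_G = BOUND_Y - SIGMA * KAPPA


def poly_mul(a, b):
    """Product in Z[x]/(x^n + 1), no reduction mod P."""
    out = [0] * N_DEG
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N_DEG:
                out[i + j] += ai * bj
            else:
                out[i + j - N_DEG] -= ai * bj
    return out


def h(key, vec):
    """h(vec) = sum_i a_i * vec_i in Z_P[x]/(x^n + 1)."""
    acc = [0] * N_DEG
    for a, v in zip(key, vec):
        acc = [(x + y) % P for x, y in zip(acc, poly_mul(a, v))]
    return acc


def rand_poly(bound):
    return [secrets.randbelow(2 * bound + 1) - bound for _ in range(N_DEG)]


def oracle(commitment, message):
    """Stand-in for H: SHA-256 mapped to a polynomial with l1 norm <= kappa."""
    digest = hashlib.sha256(repr(commitment).encode() + message).digest()
    c = [0] * N_DEG
    for i in range(KAPPA):
        c[digest[2 * i] % N_DEG] += 1 if digest[2 * i + 1] & 1 else -1
    return c


def keygen():
    key = [[secrets.randbelow(P) for _ in range(N_DEG)] for _ in range(M)]
    s = [rand_poly(SIGMA) for _ in range(M)]
    return key, s, h(key, s)


def sign(key, s, message):
    """Repeat steps 1-4 of Fig. 4 until z lands in G^m; also report the attempts."""
    attempts = 0
    while True:
        attempts += 1
        y = [rand_poly(BOUND_Y) for _ in range(M)]
        e = oracle(h(key, y), message)
        z = [[a + b for a, b in zip(poly_mul(si, e), yi)] for si, yi in zip(s, y)]
        if all(abs(coef) <= BOUND_G for poly in z for coef in poly):
            return z, e, attempts


def verify(key, S, message, z, e):
    """Accept iff z is in G^m and e = H(h(z) - Se, message)."""
    if any(abs(coef) > BOUND_G for poly in z for coef in poly):
        return False
    commitment = [(a - b) % P for a, b in zip(h(key, z), poly_mul(S, e))]
    return e == oracle(commitment, message)
```

Averaged over many signatures, the `attempts` counter should hover around e ≈ 2.7 when the success probability per try is close to 1/e, in line with the "less than 3 times" estimate above.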

The witness-indistinguishability of the signature scheme follows directly from 
the witness-indistinguishability of the ID scheme because the challenge is now 
simply generated by a random oracle rather than by the verifier. The proof 
of security of the signature scheme uses the forking lemma […] to obtain two 
signatures from a forger that use the same random oracle query. Then using the 
same ideas as in the security proof of the ID scheme, it can be shown how to 
use these signatures to obtain a solution to the Col(h, D) problem for a random 
h ∈ H(R, D, m). 

Theorem 3. If the signature scheme in Figure 4 for the parameters in the table of 
Figure 2 is not strongly unforgeable, then there is a polynomial-time algorithm 
that can solve SVP_γ(Λ) for γ = O(n^2) for every lattice Λ corresponding to an 
ideal in the ring Z[x]/(x^n + 1). 

2 One could lower the communication complexity even further by combining the 30 
hashes into a hash tree. 

3.3 Concrete Parameters 

The security of our ID (and signature) scheme depends on its soundness and 
the hardness of finding collisions in hash functions from a certain family. As 
mentioned earlier, we set the parameters κ and p such that the soundness error 
is at most 2^{-80}. We now discuss how to set the remaining parameters so that 
finding collisions in the resulting hash function is infeasible with the techniques 
known today. For this, we will use the work of […], who showed that, given 
a reasonable amount of time, algorithms for finding short vectors in random 
lattices produce a vector that is no smaller than 1.01^n times the shortest vector 
of the lattice. 

We showed that an adversary who succeeds in forging a signature can be 
used to find a collision in a hash function chosen randomly from H(R, D, m). 
This is equivalent to finding “short” vectors in a certain lattice which we will now 
define. For a polynomial a ∈ Z_p[x]/(x^n + 1), let Rot(a) be the n × n matrix 
whose i-th column is the polynomial a·x^i, and let A be the n × nm matrix A = 
[Rot(a_1) || Rot(a_2) || ... || Rot(a_m)] where the a_i are the polynomials which 
define the hash function h. If we define the lattice 
Λ_p^⊥(A) = {u ∈ Z^{mn} : Au = 0 (mod p)}, then finding a vector u ∈ Λ_p^⊥(A) 
whose ℓ_∞ norm is at most 2mnσκ is equivalent to finding a collision in 
h ∈ H(R, D, m). 

The random lattices on which the experiments of […] were run differ from 
Λ_p^⊥(A), but in [25], experiments were run on lattices that are very similar^3 
to Λ_p^⊥(A), which obtained the same results as […]. Furthermore, it was shown 
in […] that it is inefficient to try to find a short vector in Λ_p^⊥(A) by using 
all its mn dimensions. Rather, one should only use the first 
sqrt(n log p / log 1.01) dimensions and zero out the others. This results in a 
vector whose ℓ_2 length is min{p, 2^{2 sqrt(n log p log 1.01)}} and whose ℓ_∞ norm 
is at least 

    min{p, 2^{2 sqrt(n log p log 1.01)} · (n log p / log 1.01)^{-1/4}}        (3) 

Since solving the Col(h, D) problem is equivalent to finding an element y such 
that h(y) = 0 and ||y||_∞ ≤ 2mnσκ, we want to make sure that when we set 
our parameters, the value of 2mnσκ is smaller than the value in (3). In the 
instantiation of the scheme that produces a signature of length approximately 
49000 bits, the value of 2mnσκ is around 2^{23.5}, while the value of the shortest 
vector (in the ℓ_∞ norm) that can be found according to (3) is around 2^{25.5} (see 
the last two lines of the table in Figure 2). 
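
The two exponents quoted for the first instantiation can be reproduced numerically. The sketch below assumes bound (3) reads min{p, 2^{2 sqrt(n log p log 1.01)} · (n log p / log 1.01)^{-1/4}} with base-2 logarithms, and that p ≈ (2σ + 1)^m · 2^{-128/n} as in Figure 2; the function names are illustrative.

```python
from math import log2, sqrt


def log2_forgery_norm(n, m, sigma, kappa):
    """log2 of 2*m*n*sigma*kappa, the l_inf norm a collision must stay below."""
    return log2(2 * m * n * sigma * kappa)


def log2_shortest_findable(n, m, sigma, delta=1.01):
    """log2 of bound (3): the l_inf norm of the shortest vector found in practice."""
    log_p = m * log2(2 * sigma + 1) - 128 / n
    log_delta = log2(delta)
    l2_length = 2 * sqrt(n * log_p * log_delta)
    linf_norm = l2_length - 0.25 * log2(n * log_p / log_delta)
    return min(log_p, linf_norm)


if __name__ == "__main__":
    need = log2_forgery_norm(512, 4, 127, 24)
    found = log2_shortest_findable(512, 4, 127)
    print(f"needed: 2^{need:.1f}, findable: 2^{found:.1f}")
```

For (n, m, σ, κ) = (512, 4, 127, 24) the needed norm comes out below the findable one, which is the inequality the parameter choice is meant to guarantee.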

We hope that our work provides further motivation for studying 
lattice-reduction algorithms for lattices of the form Λ_p^⊥(A), which also happen 
to be central to the cryptanalysis of other lattice-based schemes such as […]. 

3 The lattices in [25] were just like Λ_p^⊥(A), except each entry of A was chosen 
uniformly at random modulo p. Since the currently best lattice-reduction 
algorithms don’t “see” the algebraic structure of the lattice, it is very 
reasonable to assume that their performance will be the same on our lattices and 
the lattices in [25]. Of course it’s possible that a different algorithm that has 
yet to be discovered will be able to use the algebraic structure of A to achieve 
better results. 

4 Factoring-Based Constructions 

We now present a modification of a signature scheme presented in [21] whose 
security is based on the hardness of factoring. We will need the following two 
definitions from [31]. 

Definition 5. A prime p is said to be a-strong if p = 2r + 1 where r is an 
integer whose prime factors are all greater than a. 

Definition 6. Let N = pq, where p and q are primes. Then an element g ∈ Z*_N 
is said to be an asymmetric basis if the parity of ord(g) in Z*_p differs from the 
parity of ord(g) in Z*_q. 

Both schemes are presented in Figure 5 (our scheme only differs from that in 
[…] by the addition of line 4), and the parameters in […] as well as our modified 
parameters are presented in Figure 6. We point out that the scheme of […] is a 
variant of Girault’s scheme […], and our technique of shortening the signature 
length would apply equally well to all its variants […] as well as to the 
blind signature constructed in […]. 

The signature of a message μ consists of the pair (z, e). In the non-aborting 
version of the protocol, z has length k + k' + log σ = 360 bits, while 
in our protocol the length is k + 1 + log σ = 297 bits. The savings are essentially 
due to the fact that we can pick y in a much smaller range, and the fact that we are 
allowed to abort keeps the scheme secure. 

If in step 4, z is not in G, then the signing procedure has to be repeated. 
It can be shown that this happens with probability 1/2. So we expect to run 
the signing protocol twice for every signature. But if we assume that off-line 
computations (i.e. computations before receiving the message) are free, then we 
can change the protocol so that we expect to compute just one extra random 
oracle query over the non-aborting signature scheme. The way to do this is to 
always keep several y_i and g^{y_i} mod N stored along with the ranges that e would 
have to fall into so that se + y_i ∈ G (the range is just (G − y_i)/s). Then when we 
are asked to sign a message μ, we compute e = H(g^{y_1} mod N, μ) and then check 
whether it’s in the valid range of y_1. If it is, then we compute se + y_1 and output 
it. If it’s not, then we recompute e using y_2, and so on. The important thing to 
note is that we only compute se + y_i once, and we still expect to succeed after 
two tries. As an added bonus, we only use up one y_i per message, since the y_i 
that “didn’t work” can be safely tried for the next message. 

Secret Key: s ←$ D_s 
Public Key: N, g, and S ← g^s mod N 
Random Oracle: H : {0, 1}* → D_c 

Sign(μ, N, g, s)                        Verify(μ, z, e, N, g, S) 
1: y ←$ D_y                             1: Accept iff e = H(g^z S^{-e} mod N, μ) 
2: e ← H(g^y mod N, μ) 
3: z ← se + y 
4: [if z ∉ G, then goto step 1] 
5: output (z, e) 

Fig. 5. Factoring-Based Signature Schemes. Line 4 is only executed in the 
aborting scheme. 

                       Without Aborting            With Aborting 
---------------------  --------------------------  ---------------------------- 
k                      128 
N                      1024-bit product of two 2^k-strong primes 
g                      asymmetric basis in Z*_N such that ord(g) has 160 bits 
σ                      2^{168} 
D_c                    {0, ..., 2^k} 
D_s                    {0, ..., σ} 
k'                     64                          (not used) 
D_y                    {0, ..., 2^{k+k'} · σ}      {0, ..., 2^{k+1} · σ} 
G                      (not used)                  {2^k · σ, ..., 2^{k+1} · σ} 
Signature Size (bits)  488                         425 

Fig. 6. Factoring-Based Scheme’s Variable Definitions 
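
The aborting variant of Figure 5 can be sketched as follows. This is a toy illustration under explicit assumptions: the modulus is a product of two Mersenne primes rather than of strong primes, g = 3 is not checked to be an asymmetric basis, K and SIGMA are smaller than in Figure 6, SHA-256 reduced modulo 2^K + 1 stands in for the random oracle, and all names are hypothetical. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import hashlib
import secrets

K = 64                              # challenge length (toy; Fig. 6 uses k = 128)
SIGMA = 2**80                       # bound on the secret key (toy; Fig. 6 uses 2^168)
N = (2**31 - 1) * (2**61 - 1)       # toy modulus; NOT a product of strong primes
G_BASE = 3                          # toy base; not verified to be an asymmetric basis
LOW, HIGH = 2**K * SIGMA, 2**(K + 1) * SIGMA   # G = {2^k * sigma, ..., 2^(k+1) * sigma}


def oracle(commitment, message):
    """Stand-in for H: SHA-256 reduced into D_c = {0, ..., 2^K} (slightly biased)."""
    data = commitment.to_bytes(16, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (2**K + 1)


def keygen():
    s = secrets.randbelow(SIGMA + 1)
    return s, pow(G_BASE, s, N)


def sign(s, message):
    """Steps 1-5 of Fig. 5, including the abort in step 4; also report the attempts."""
    attempts = 0
    while True:
        attempts += 1
        y = secrets.randbelow(HIGH + 1)            # y in D_y = {0, ..., 2^(k+1) * sigma}
        e = oracle(pow(G_BASE, y, N), message)
        z = s * e + y
        if LOW <= z <= HIGH:                       # step 4: keep z only if it lies in G
            return z, e, attempts


def verify(S, message, z, e):
    """Accept iff e = H(g^z * S^(-e) mod N, message)."""
    commitment = (pow(G_BASE, z, N) * pow(S, -e, N)) % N
    return e == oracle(commitment, message)
```

Since y is uniform over an interval twice as long as G, each attempt succeeds with probability about 1/2, so `attempts` averages about 2.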

Theorem 4. An adversary who breaks the aborting signature scheme in T steps 
can be used to factor N in poly(T) steps. 

Abstract. We describe public key encryption schemes with security 
provably based on the worst case hardness of the approximate Shortest 
Vector Problem in some structured lattices, called ideal lattices. Under 
the assumption that the latter is exponentially hard to solve even with a 
quantum computer, we achieve CPA-security against subexponential attacks, 
with (quasi-)optimal asymptotic performance: if n is the security 
parameter, both keys are of bit-length Õ(n) and the amortized costs of 
both encryption and decryption are Õ(1) per message bit. Our construction 
adapts the trapdoor one-way function of Gentry et al. (STOC’08), 
based on the Learning With Errors problem, to structured lattices. Our 
main technical tools are an adaptation of Ajtai’s trapdoor key generation 
algorithm (ICALP’99) and a re-interpretation of Regev’s quantum 
reduction between the Bounded Distance Decoding problem and 
sampling short lattice vectors. 


1 Introduction 

Lattice-based cryptography has been rapidly developing in the last few years, 
inspired by the breakthrough result of Ajtai in 1996 […], who constructed a one-way 
function with average-case security provably related to the worst-case complexity 
of hard lattice problems. The attractiveness of lattice-based cryptography stems 
from its provable security guarantees, well studied theoretical underpinnings, 
simplicity and potential efficiency (Ajtai’s one-way function is a matrix-vector 
multiplication over a small finite field), and also the apparent security against 
quantum attacks. The main complexity assumption is the hardness of approximate 
versions of the Shortest Vector Problem (SVP). The GapSVP_{γ(n)} problem 
consists in, given a lattice of dimension n and a scalar d, replying YES if there 
exists a non-zero lattice vector of norm ≤ d and NO if all non-zero lattice vectors 
have norm ≥ γ(n)d. The complexity of GapSVP_{γ(n)} increases with n, but 
decreases with γ(n). Although the latter is believed to be exponential in n for any 
polynomial γ(n), minimizing the degree of γ(n) is very important in practice, to 
allow the use of a practical dimension n for a given security level. 

M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 617-635, 2009. 

Lattice-based public key encryption. The first provably secure lattice-based 
cryptosystem was proposed by Ajtai and Dwork […], and relied on a variant 
of GapSVP in arbitrary lattices (it is now known to also rely on GapSVP […]). 
Subsequent works proposed more efficient alternatives […]. The current 
state of the art […] is a scheme with public/private key length Õ(n^2) and 
encryption/decryption throughput of Õ(n) bit operations per message bit. Its 
security relies on the quantum worst-case hardness of GapSVP_{Õ(n^{1.5})} in 
arbitrary lattices. The security can be de-quantumized at the expense of both 
increasing γ(n) and decreasing the efficiency, or relying on a new and less studied 
problem [28]. In parallel to the provably secure schemes, there have also been 
heuristic proposals […]. In particular, unlike the above schemes which use 
unstructured random lattices, the NTRU encryption scheme […] exploits the 
properties of structured lattices to achieve high efficiency with respect to key 
length (Õ(n) bits) and encryption/decryption cost (Õ(1) bit operations per 
message bit). Unfortunately, its security remains heuristic and it was an important 
open challenge to provide a provably secure scheme with comparable efficiency. 

Provably Secure Schemes from Ideal Lattices. Micciancio […] introduced 
the class of structured cyclic lattices, which correspond to ideals in 
polynomial rings Z[x]/(x^n − 1), and presented the first provably secure one-way 
function based on the worst-case hardness of the restriction of Poly(n)-SVP to 
cyclic lattices. (The problem γ-SVP consists in computing a non-zero vector of 
a given lattice, whose norm is no more than γ times larger than the norm of 
a shortest non-zero lattice vector.) At the same time, thanks to its algebraic 
structure, this one-way function enjoys high efficiency comparable to the NTRU 
scheme (Õ(n) evaluation time and storage cost). Subsequently, Lyubashevsky 
and Micciancio […] and independently Peikert and Rosen […] showed how to 
modify Micciancio’s function to construct an efficient and provably secure 
collision resistant hash function. For this, they introduced the more general class 
of ideal lattices, which correspond to ideals in polynomial rings Z[x]/f(x). The 
collision resistance relies on the hardness of the restriction of Poly(n)-SVP to ideal 
lattices (called Poly(n)-Ideal-SVP). The average-case collision-finding problem 
is a natural computational problem called Ideal-SIS, which has been shown to 
be as hard as the worst-case instances of Ideal-SVP. Provably secure efficient 
signature schemes from ideal lattices have also been proposed […], but 
constructing efficient provably secure public key encryption from ideal lattices 
was an interesting open problem. 

Our results. We describe the first provably CPA-secure public key encryption 
scheme whose security relies on the hardness of the worst-case instances of 
Õ(n^2)-Ideal-SVP against subexponential quantum attacks. It achieves 
asymptotically optimal efficiency: the public/private key length is Õ(n) bits and the 
amortized encryption/decryption cost is Õ(1) bit operations per message bit 
(encrypting Ω̃(n) bits at once, at a Õ(n) cost). Our security assumption is 
that Õ(n^2)-Ideal-SVP cannot be solved by any subexponential time quantum 
algorithm, which is reasonable given the state-of-the-art lattice algorithms […]. 
Note that this is stronger than standard public key cryptography security 
assumptions. On the other hand, contrary to most of public key cryptography, 
lattice-based cryptography allows security against subexponential quantum 
attacks. Our main technical tool is a re-interpretation of Regev’s quantum 
reduction […] between the Bounded Distance Decoding problem (BDD) and sampling 
short lattice vectors. Also, by adapting Ajtai’s trapdoor generation algorithm […] 
(or more precisely its recent improvement by Alwen and Peikert […]) to structured 
ideal lattices, we are able to construct efficient provably secure trapdoor 
signatures, ID-based identification schemes, CCA-secure encryption and ID-based 
encryption. We think these techniques are very likely to find further applications. 

Most of the cryptosystems based on general lattices […] rely on 
the average-case hardness of the Learning With Errors (LWE) problem 
introduced in […]. Our scheme is based on a structured variant of LWE, that we 
call Ideal-LWE. We introduce novel techniques to circumvent two main 
difficulties that arise from the restriction to ideal lattices. Firstly, the previous 
cryptosystems based on unstructured lattices all make use of Regev’s worst-case to 
average-case classical reduction […] from BDD to LWE (this is the classical step 
in the quantum reduction of […] from SVP to LWE). This reduction exploits 
the unstructured-ness of the considered lattices, and does not seem to carry over 
to the structured lattices involved in Ideal-LWE. In particular, the probabilistic 
independence of the rows of the LWE matrices allows one to consider a single row 
in […, Cor. 3.10]. Secondly, the other ingredient used in previous cryptosystems, 
namely Regev’s reduction […] from the computational variant of LWE to its 
decisional variant, also seems to fail for Ideal-LWE: it relies on the probabilistic 
independence of the columns of the LWE matrices. 

Our solution to the above difficulties avoids the classical step of the 
reduction from […] altogether. Instead, we use the quantum step to construct a new 
quantum average-case reduction from SIS (the unstructured variant of Ideal-SIS) 
to LWE. It also works from Ideal-SIS to Ideal-LWE. Combined with the known 
reduction from worst-case Ideal-SVP to average-case Ideal-SIS […], we obtain a 
quantum reduction from Ideal-SVP to Ideal-LWE. This shows the hardness of 
the computational variant of Ideal-LWE. Because we do not obtain the hardness 
of the decisional variant, we use a generic hardcore function to derive 
pseudorandom bits for encryption. This is why we need to assume the exponential 
hardness of SVP. The encryption scheme follows as an adaptation of […, Sec. 7.1]. 

The main idea of our new quantum reduction from Ideal-SIS to Ideal-LWE is 
a re-interpretation of Regev’s quantum step in […]. The latter was presented as 
a worst-case quantum reduction from sampling short lattice vectors in a lattice L 
to solving BDD in the dual lattice L̂. We observe that this reduction is actually 
stronger: it is an average-case reduction which works given an oracle for BDD in L̂ 
with a normally distributed error vector. Also, as pointed out in […], LWE can be 
seen as a BDD with a normally distributed error in a certain lattice whose dual 
is essentially the SIS lattice. This leads to our SIS to LWE reduction. Finally 
we show how to apply it to reduce Ideal-SIS to Ideal-LWE; this involves a 
probabilistic lower bound for the minimum of the Ideal-LWE lattice. We believe 
our new SIS to LWE reduction is of independent interest. Along with […], it 
provides an alternative to Regev’s quantum reduction from GapSVP to LWE. 
Ours is weaker because the derived GapSVP factor increases with the number 
of LWE samples, but it has the advantage of carrying over to the ideal case. Also, 
when choosing practical parameters for lattice-based encryption (see, e.g., […]), 
it is impractical to rely on the worst-case hardness of SVP. Instead, the practical 
average-case hardness of LWE is evaluated based on the best known attack which 
consists in solving SIS. Our reduction justifies this heuristic by showing that it 
is indeed necessary to (quantumly) break SIS in order to solve LWE. 

Road-map. We provide some background in Section 2. Section 3 shows how to 
hide a trapdoor in the adaptation of SIS to ideal lattices. Section 4 contains the 
new reduction between SIS and LWE. Finally, in Section 5, we present our 
CPA-secure encryption scheme and briefly describe other cryptographic constructions. 

Notation. Vectors will be denoted in bold. We denote by ⟨·,·⟩ and ||·|| the 
inner product and the Euclidean norm. We denote by ρ_s(x) (resp. ν_s) the 
standard n-dimensional Gaussian function (resp. distribution) with center 0 and 
variance s, i.e., ρ_s(x) = exp(−π||x||^2/s^2) (resp. ν_s(x) = ρ_s(x)/s^n). We use 
the notations Õ(·) and Ω̃(·) to hide poly-logarithmic factors. If D_1 and D_2 are 
two probability distributions over a discrete domain E, their statistical distance 
is Δ(D_1, D_2) = (1/2) Σ_{x∈E} |D_1(x) − D_2(x)|. If a function f over a countable 
domain E takes non-negative real values, its sum over an arbitrary F ⊆ E will be 
denoted by f(F). If q is a prime number, we denote by Z_q the field of integers 
modulo q. We denote by Ψ_s the reduction modulo q of ν_s. 

2 Reminders and Background Results on Lattices 

We refer to […] for a detailed introduction to the computational aspects of 
lattices. In the present section, we briefly recall some fundamental 
properties of lattices that we will need. We then introduce the so-called ideal 
lattices, and finally formally define some computational problems. 

Euclidean lattices. An n-dimensional lattice L is the set of all integer 
linear combinations of some linearly independent vectors b_1, ..., b_n ∈ R^n, i.e., 
L = Σ_{i≤n} Z·b_i. The b_i’s are called a basis of L. The i-th minimum λ_i(L) is the 
smallest r such that L contains i linearly independent vectors of norms ≤ r. 
We let λ_1^∞(L) denote the first minimum of L with respect to the infinity norm. 
If B = (b_1, ..., b_n) is a basis, we define its norm by ||B|| = max_i ||b_i|| and its 
fundamental parallelepiped by P(B) = {Σ_i c_i b_i | c ∈ [0, 1)^n}. Given a basis B 
for lattice L and a vector c ∈ R^n, we define c mod L as the unique vector 
in P(B) such that c − (c mod L) ∈ L (the basis being implicit). For any 
lattice L and any s > 0, the sum ρ_s(L) is finite. We define the lattice Gaussian 
distribution by D_{L,s}(b) = ρ_s(b)/ρ_s(L) for any b ∈ L. If L is a lattice, its dual L̂ 
is the lattice {b̂ ∈ R^n | ∀b ∈ L, ⟨b̂, b⟩ ∈ Z}. We will use the following results. 

Lemma 1 ([…, Lemma 2.11] and […, Lemma 3.5]). For any x in an 
n-dimensional lattice L and s ≥ 2·sqrt(ln(10n)/π)/λ_1^∞(L̂), we have 
D_{L,s}(x) ≤ 2^{-n+1}. 

Lemma 2 ([…, Lemma 2.10]). Given an n-dimensional lattice L, we have 
Pr_{x∼D_{L,s}}[||x|| > s·sqrt(n)] ≤ 2^{-n+1}. 

Ideal lattices. Ideal lattices are a subset of lattices with the computationally 
interesting property of being related to polynomials via structured matrices. The 
n-dimensional vector-matrix product costs Õ(n) arithmetic operations instead 
of O(n^2). Let f ∈ Z[x] be a monic degree n polynomial. For any g ∈ Q[x], there is a 
unique pair (q, r) with deg(r) < n and g = qf + r. We denote r by g mod f and 
identify r with the vector r ∈ Q^n of its coefficients. We define rot_f(r) ∈ Q^{n×n} as 
the matrix whose rows are the x^i·r(x) mod f(x)’s, for 0 ≤ i < n. We extend that 
notation to the matrices A over Q[x]/f, by applying rot_f component-wise. Note 
that rot_f(g_1)·rot_f(g_2) = rot_f(g_1·g_2) for any g_1, g_2 ∈ Q[x]/f. The strengths of our 
cryptographic constructions depend on the choice of f. Its quality is quantified 
by its expansion factor (we adapt the definition of […] to the Euclidean norm): 

    EF(f, k) = max { ||g mod f|| / ||g||  |  g ∈ Z[x] \ {0} and deg(g) ≤ k(deg(f) − 1) }, 

where we identified the polynomial g mod f (resp. g) with the coefficients vector. 
Note that if deg(g) < n, then ||rot_f(g)|| ≤ EF(f, 2)·||g||. We will concentrate 
on the polynomials x^{2^k} + 1, although most of our results are more general. We 
recall some basic properties of x^{2^k} + 1 (see […] for the last one). 
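
The operator rot_f and its multiplicativity can be illustrated for f = x^n + 1 with integer coefficients. The sketch below is an illustration only; the helper names are not from the paper, and the quadratic-time product is used purely for clarity rather than the quasi-linear arithmetic mentioned above.

```python
def poly_mul(a, b):
    """Product of a and b in Z[x]/(x^n + 1), given as coefficient lists of length n."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                out[i + j] += ai * bj
            else:
                out[i + j - n] -= ai * bj   # x^n = -1 modulo f
    return out


def rot(r):
    """Matrix whose i-th row holds the coefficients of x^i * r(x) mod (x^n + 1)."""
    rows, row = [], list(r)
    for _ in range(len(r)):
        rows.append(row)
        row = [-row[-1]] + row[:-1]         # multiply by x: shift, wrap with a sign flip
    return rows


def mat_mul(A, B):
    """Plain square matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    g1, g2 = [1, 2, 0, -1], [3, 0, 1, 4]
    print(mat_mul(rot(g1), rot(g2)) == rot(poly_mul(g1, g2)))
```

With rows defined this way, multiplying a row vector a by rot(g) gives the coefficients of a·g mod f, which is why the matrix product mirrors the polynomial product.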

Lemma 3. Let k ≥ 0 and n = 2^k. Then f(x) = x^n + 1 is irreducible in Q[x]. 
Its expansion factor is ≤ sqrt(2). Also, for any g = Σ_{i<n} g_i x^i ∈ Q[x]/f, we 
have rot_f(g)^T = rot_f(ḡ) where ḡ = g_0 − Σ_{1≤i<n} g_{n−i} x^i. Furthermore, if q is 
a prime such that 2n | (q − 1), then f has n linear factors in Z_q[x]. Finally, 
if k ≥ 2 and q is a prime with q ≡ 3 mod 8, then f = f_1·f_2 mod q where each f_i 
is irreducible in Z_q[x] and can be written f_i = x^{n/2} + t_i·x^{n/4} − 1 with t_i ∈ Z_q. 
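
Two of the statements in Lemma 3 are easy to spot-check on small cases. The sketch below assumes the transposition identity reads rot_f(g)^T = rot_f(ḡ) with ḡ = g_0 − Σ_{1≤i<n} g_{n−i} x^i, and counts roots of x^n + 1 modulo small primes by brute force; the helper names and the chosen primes are illustrative.

```python
def rot(r):
    """Matrix whose i-th row holds the coefficients of x^i * r(x) mod (x^n + 1)."""
    rows, row = [], list(r)
    for _ in range(len(r)):
        rows.append(row)
        row = [-row[-1]] + row[:-1]
    return rows


def conj(g):
    """Coefficients of g_0 - sum_{1 <= i < n} g_{n-i} x^i."""
    n = len(g)
    return [g[0]] + [-g[n - i] for i in range(1, n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def count_roots(n, q):
    """Number of x in Z_q with x^n + 1 = 0 mod q (brute force, small q only)."""
    return sum(1 for x in range(q) if (pow(x, n, q) + 1) % q == 0)


if __name__ == "__main__":
    g = [1, 2, 0, -1]
    print(transpose(rot(g)) == rot(conj(g)))
    print(count_roots(8, 17))   # 2n = 16 divides q - 1 = 16
    print(count_roots(4, 11))   # q = 11 is 3 mod 8, so no linear factors are expected
```

A full set of n roots corresponds to the n linear factors in the lemma, while the absence of roots for q ≡ 3 mod 8 is consistent with a factorization into two irreducible factors of degree n/2.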

Let I be an ideal of Z[x]/f, i.e., a subset of Z[x]/f closed under addition and 
multiplication by any element of Z[x]/f. It corresponds to a sublattice of Z^n. 
An f-ideal lattice is a sublattice of Z^n that corresponds to an ideal I ⊆ Z[x]/f. 

Hard lattice problems. The most famous lattice problem is SVP. Given a basis 
of a lattice L, it aims at finding a shortest vector in L \ {0}. It can be relaxed by 
asking for a non-zero vector that is no longer than γ(n) times a solution to SVP, 
for a prescribed function γ(·). The best polynomial time algorithm […] solves 
γ-SVP only for a slightly subexponential γ. When γ is polynomial in n, then the 
most efficient algorithm […] has an exponential worst-case complexity both in 
time and space. If we restrict the set of input lattices to ideal lattices, we obtain 
the problem Ideal-SVP (resp. γ-Ideal-SVP), which is implicitly parameterized 
by a sequence of polynomials f of growing degrees. No algorithm is known to 
perform non-negligibly better for Ideal-SVP than for SVP. It is believed that 
no subexponential quantum algorithm solves the computational variants of SVP 
or Ideal-SVP in the worst case. These worst-case problems can be reduced to 
the following average-case problems, introduced in […] and […]. 

Definition 1. The Small Integer Solution problem with parameters $q(\cdot)$, $m(\cdot)$,
$\beta(\cdot)$ ($\mathrm{SIS}_{q,m,\beta}$) is as follows: Given $n$ and a matrix $G$ sampled uniformly in
$\mathbb{Z}_{q(n)}^{m(n) \times n}$, find $e \in \mathbb{Z}^{m(n)} \setminus \{0\}$ such that $e^T G = 0 \bmod q(n)$ (the modulus
being taken component-wise) and $\|e\| \le \beta(n)$. The Ideal Small Integer Solution
problem with parameters $q, m, \beta$ and $f$ ($\text{Ideal-SIS}^f_{q,m,\beta}$) is as follows: Given $n$
and $m$ polynomials $g_1, \ldots, g_m$ chosen uniformly and independently in $\mathbb{Z}_q[x]/f$,
find $e_1, \ldots, e_m \in \mathbb{Z}[x]$ not all zero such that $\sum_{i \le m} e_i g_i = 0$ in $\mathbb{Z}_q[x]/f$ and
$\|e\| \le \beta$, where $e$ is the vector obtained by concatenating the coefficients of the $e_i$'s.

The above problems can be interpreted as lattice problems. If $G \in \mathbb{Z}_q^{m \times n}$, then
the set $G^{\perp} = \{b \in \mathbb{Z}^m \mid b^T G = 0 \bmod q\}$ is an $m$-dimensional lattice and solving
SIS corresponds to finding a short non-zero vector in it. Similarly, Ideal-SIS
consists in finding a small non-zero element in the $\mathbb{Z}[x]/f$-module $M^{\perp}(g) =
\{b \in (\mathbb{Z}[x]/f)^m \mid \langle b, g \rangle = 0 \bmod q\}$, where $g = (g_1, \ldots, g_m)$. It can be seen as
a lattice problem by applying the $\mathrm{rot}_f$ operator. Note that the $m$ of SIS is $n$
times larger than the $m$ of Ideal-SIS. Lyubashevsky and Micciancio HZ! reduced
Ideal-SVP to Ideal-SIS. The approximation factors in nz! are given in terms
of the infinity norm. For our purposes, it is more natural to use the Euclidean
norm. To avoid losing a $\sqrt{n}$ factor by simply applying the norm equivalence
formula, we modify the proof of m-. We also adapt it to handle the case where
the Ideal-SIS solver has a subexponentially small success probability, at the cost
of an additional factor of $\tilde{O}(\sqrt{n})$ in the SVP approximation factor.
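
The link between Ideal-SIS and SIS through the $\mathrm{rot}_f$ operator can be made concrete. The following sketch, restricted to $f = x^n + 1$ and using hypothetical helper names, computes $\sum_i e_i g_i$ in $\mathbb{Z}_q[x]/f$ once by polynomial arithmetic and once as the vector-matrix product of the concatenated coefficients of the $e_i$'s with the stacked matrices $\mathrm{rot}_f(g_i)$. The row-vector convention and the lowest-degree-first coefficient order are assumptions of the example.

```python
def mul_mod(e, g, q):
    """Schoolbook product e(x) * g(x) in Z_q[x]/(x^n + 1)."""
    n = len(g)
    out = [0] * n
    for i, ei in enumerate(e):
        for j, gj in enumerate(g):
            k = i + j
            if k < n:
                out[k] += ei * gj
            else:
                out[k - n] -= ei * gj  # x^n = -1
    return [c % q for c in out]


def rot(g):
    """Rows are x^i * g(x) mod (x^n + 1) for 0 <= i < n."""
    rows, row = [], list(g)
    for _ in range(len(g)):
        rows.append(row)
        row = [-row[-1]] + row[:-1]
    return rows


def ideal_sis_sum(es, gs, q):
    """sum_i e_i * g_i computed in the ring."""
    total = [0] * len(gs[0])
    for e, g in zip(es, gs):
        total = [(t + c) % q for t, c in zip(total, mul_mod(e, g, q))]
    return total


def sis_product(es, gs, q):
    """Same quantity as e^T G with G the vertical stack of the rot(g_i)."""
    e_concat = [c for e in es for c in e]
    G = [row for g in gs for row in rot(g)]
    cols = range(len(G[0]))
    return [sum(e_concat[i] * G[i][j] for i in range(len(G))) % q for j in cols]


es = [[1, 0, -1, 2], [0, 1, 1, -1]]
gs = [[3, 5, 7, 11], [2, 4, 6, 8]]
print(ideal_sis_sum(es, gs, 17) == sis_product(es, gs, 17))
```

Here the stacked matrix has $mn$ rows and $n$ columns, which matches the remark that the $m$ of SIS is $n$ times larger than the $m$ of Ideal-SIS.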

Theorem 1. Suppose that $f$ is irreducible over $\mathbb{Q}$. Let $m = \mathcal{P}oly(n)$ and $q =
\tilde{\Omega}(\mathrm{EF}(f, 3)\,\beta m^2 n)$ be integers. A polynomial-time (resp. subexponential-time)
algorithm solving $\text{Ideal-SIS}^f_{q,m,\beta}$ with probability $1/\mathcal{P}oly(n)$ (resp. $2^{-o(n)}$) can be
used to solve $\gamma$-Ideal-SVP in polynomial-time (resp. subexponential-time) with
$\gamma = \tilde{O}(\mathrm{EF}^2(f, 2)\,\beta m n^{1/2})$ (resp. $\gamma = \tilde{O}(\mathrm{EF}^2(f, 2)\,\beta m n)$).

The problem LWE is dual to SIS in the sense that if $G \in \mathbb{Z}_q^{m \times n}$ is the SIS-matrix,
then LWE involves the dual of the lattice $G^{\perp}$. We have $\widehat{G^{\perp}} = \frac{1}{q} L(G)$
where $L(G) = \{b \in \mathbb{Z}^m \mid \exists s \in \mathbb{Z}_q^n,\ Gs = b \bmod q\}$.

Definition 2. The Learning With Errors problem with parameters $q$, $m$ and a
distribution $\chi$ on $\mathbb{R}/[0, q)$ ($\mathrm{LWE}_{m,q;\chi}$) is as follows: Given $n$, a matrix $G \in \mathbb{Z}_q^{m \times n}$
sampled uniformly at random and $Gs + e \in (\mathbb{R}/[0, q))^m$, where $s \in \mathbb{Z}_q^n$ is chosen
uniformly at random and the coordinates of $e \in (\mathbb{R}/[0, q))^m$ are independently
sampled from $\chi$, find $s$. The Ideal Learning With Errors problem with parameters
$q, m$, a distribution $\chi$ on $\mathbb{R}/[0, q)$ and $f$ ($\text{Ideal-LWE}^f_{m,q;\chi}$) is the same as
above, except that $G = \mathrm{rot}_f(g)$ with $g$ chosen uniformly in $(\mathbb{Z}_q[x]/f)^m$.
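
As an illustration of Definition 2, the sketch below samples a small LWE instance. It makes several assumptions that are not fixed by the definition itself: the error distribution is a Gaussian of density proportional to $\exp(-\pi x^2/(\alpha q)^2)$, i.e. of standard deviation $\alpha q/\sqrt{2\pi}$, it is rounded to the nearest integer so that everything stays in $\mathbb{Z}_q$, and the function name `lwe_instance` and the parameter values are hypothetical.

```python
import math
import random


def lwe_instance(n, m, q, alpha, rng):
    """Return (G, s, e, c) with c = G s + e mod q and rounded Gaussian errors."""
    G = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    s = [rng.randrange(q) for _ in range(n)]
    # Assumed convention: parameter alpha*q means std dev alpha*q / sqrt(2*pi).
    sigma = alpha * q / math.sqrt(2 * math.pi)
    e = [round(rng.gauss(0.0, sigma)) for _ in range(m)]
    c = [(sum(g * x for g, x in zip(row, s)) + ei) % q for row, ei in zip(G, e)]
    return G, s, e, c


G, s, e, c = lwe_instance(n=4, m=12, q=257, alpha=0.01, rng=random.Random(1))
# The LWE problem asks to recover s from (G, c) alone.
print(len(G), len(G[0]), len(c))
```

For the ideal variant, $G$ would instead be obtained by stacking the matrices $\mathrm{rot}_f(g_i)$ of $m$ uniformly chosen ring elements.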

We will use the following results on the LWE and Ideal-LWE lattices.

Lemma 4. Let $n, m$ and $q$ be integers with $q$ prime, $m \ge 5n \log q$ and $n \ge 10$.
Then for all but a fraction $\le q^{-n}$ of the $G$'s in $\mathbb{Z}_q^{m \times n}$, we have $\lambda_1^{\infty}(L(G)) \ge q/4$
and $\lambda_1(L(G)) \ge 0.07\sqrt{m}\,q$.

Lemma 5. Let $n, m$ and $q$ be integers with $q \equiv 3 \bmod 4$ prime and $m \ge 41 \log q$
and $n = 2^k \ge 32$. Then for all but a fraction $\le q^{-n}$ of the $g$'s in $(\mathbb{Z}_q[x]/f)^m$,
we have $\lambda_1^{\infty}(L(\mathrm{rot}_f(g))) \ge q/4$ and $\lambda_1(L(\mathrm{rot}_f(g))) \ge 0.017\sqrt{mn}\,q$.

3 Hiding a Trapdoor in Ideal-SIS 

In this section we show how to hide a trapdoor in the problem Ideal-SIS. Ajtai
P| showed how to simultaneously generate a (SIS) matrix $A \in \mathbb{Z}_q^{m \times n}$ and
a (trapdoor) basis $S = (s_1, \ldots, s_m) \in \mathbb{Z}^{m \times m}$ of the lattice $A^{\perp} = \{b \in \mathbb{Z}^m :
b^T A = 0 \bmod q\}$, with the following properties:

1. The distribution of $A$ is close to the uniform distribution over $\mathbb{Z}_q^{m \times n}$.

2. The basis vectors $s_1, \ldots, s_m$ are short.

Recently, Alwen and Peikert 0 improved Ajtai's construction in the sense that
the created basis has shorter vectors: $\|S\| = O(n \log q)$ with $m = \Omega(n \log q)$
and overwhelming probability and $\|S\| = O(\sqrt{n \log q})$ with $m = \Omega(n \log^2 q)$.
We modify both constructions to obtain a trapdoor generation algorithm for the
problem Ideal-SIS, with a resulting basis whose norm is as small as the one of 0.

Before describing the construction, we notice that the construction of 0 relies 
on the Hermite Normal Form (HNF), but that here there is no Hermite Normal 
Form for the rings under scope. We circumvent this issue by showing that except 
in negligibly rare cases we may use a matrix which is HNF-like. 

Theorem 2. There exists a probabilistic polynomial time algorithm with the following
properties. It takes as inputs $n$, $\sigma$, $r$, an odd prime $q$, and integers $m_1, m_2$.
It also takes as input a degree $n$ polynomial $f \in \mathbb{Z}[x]$ and random polynomials
$a_1 \in (\mathbb{Z}_q[x]/f)^{m_1}$. We let $f = \prod_{i \le t} f_i$ be the factorization of $f$ over $\mathbb{Z}_q$. We
let $\kappa = \lceil 1 + \log q \rceil$, $\Delta = \left(\prod_{i \le t}\left(1 + (q/3^r)^{\deg f_i}\right) - 1\right)^{1/2}$ and $m = m_1 + m_2$. The
algorithm succeeds with probability $\ge 1 - p$ over $a_1$, where $p = \left(1 - \prod_{i \le t}(1 -
q^{-\deg f_i})\right)^{\sigma}$. When it does, it returns $a = (a_1; a_2) \in (\mathbb{Z}_q[x]/f)^m$ and a basis $S$ of
the lattice $\mathrm{rot}_f(a)^{\perp}$, such that:

1. The distance to uniformity of $a$ is at most $p + m_2 \Delta$.

2. The quality of $S$ is as follows:

— If mi > max{cr, k, r} and m 2 > k, then || S'!! < EF(/, 2) ■ \[2 Kr^n 3 / 2 . 
Additionally, ||S|| < EF(/, 2)y/3anr-n with probability i_ 2 - a +°( lo s nrn i r ) 
for a super-logarithmic function a = a(n) = w(logn). 

- If $m_1 \ge \max\{\sigma, \kappa, r\}$ and $m_2 \ge \kappa m_1$, then $\|S\| \le \mathrm{EF}(f, 2)(4\sqrt{nr} + 3)$.

3. In particular, for $f = x^{2^k} + 1$ with $k \ge 2$ and a prime $q$ with $q \equiv 3 \bmod 8$,
the following holds:
- We can set $\sigma = 1$ and $r = \lceil 1 + \log_3 q \rceil$. Then, the error probability is
$p = q^{-\Omega(n)}$ and the parameter $\Delta$ is $2^{-\Omega(n)}$.

— If mi, m 2 > k, then ||5|| < \J Ganr ■ n = 0(^/anlogq) with probability 
1 — 2 _a +°( logTimir ') for a super-logarithmic function a = a(n) = w(logn). 
- If $m_1 \ge \kappa$ and $m_2 \ge \kappa m_1$, then $\|S\| \le \sqrt{2}(4\sqrt{nr} + 3) = O(\sqrt{n \log q})$.

In the rest of this section, we only describe the analog of the second construction
of Alwen and Peikert, i.e., the case $m_2 \ge \kappa m_1$, due to lack of space.


3.1 A Trapdoor for Ideal-SIS 

We now construct the trapdoor for Ideal-SIS. More precisely, we want to
simultaneously construct a uniform $a \in \mathcal{R}^m$ with $\mathcal{R} = \mathbb{Z}_q[x]/f$, and a small basis $S$
of the lattice $A^{\perp}$ where $A = \mathrm{rot}_f(a)$. For this, it suffices to find a basis of the
module $M^{\perp}(a) = \{y \in \mathcal{R}_0^m \mid \langle y, a \rangle \equiv 0 \bmod q\}$, with $\mathcal{R}_0 = \mathbb{Z}[x]/f$.

The principle of the design. In the following, for two matrices $X$ and $Y$,
$[X|Y]$ denotes the concatenation of the columns of $X$ followed by $Y$ and $[X; Y]$
denotes the concatenation of the rows of $X$ and the rows of $Y$.

We mainly follow the Alwen-Peikert construction. Let $m_1 \ge \sigma, r$. Let us
assume that we generate random polynomials $A_1 = [a_1, \ldots, a_{m_1}]^T \in \mathcal{R}^{m_1 \times 1}$.
We will construct a random matrix $A_2 \in \mathcal{R}^{m_2 \times 1}$ with a structured matrix
$S \in \mathcal{R}_0^{m \times m}$ such that $SA = 0$ and $S$ is a basis of the module $M^{\perp}(a)$, where
$A = [A_1; A_2]$. We first construct an HNF-like basis $F$ of the module $M^{\perp}(a)$ with
$A$. Next, we construct a unimodular matrix $Q$ such that $S = QF$ is a short basis
of the module. More precisely, $S$ has the following form:

$$S = \begin{pmatrix} V & P \\ D & B \end{pmatrix} = \begin{pmatrix} -I_{m_1} & P \\ 0 & B \end{pmatrix} \cdot \begin{pmatrix} H & 0 \\ U & I_{m_2} \end{pmatrix} = Q \cdot F.$$

Note that, by setting $B$ lower triangular with diagonal coefficients equal to 1,
the matrix $Q$ is unimodular.

In this design principle, we want $FA = 0$. Hence, we should set

$$HA_1 = 0 \quad \text{and} \quad A_2 = -UA_1.$$

Notice that, in order to prove that $F$ is a basis of $A^{\perp}$, it suffices to show that
$H$ is a basis of $A_1^{\perp}$. The first equation is satisfied by setting $H$ to be an HNF-like
matrix (see below). By setting $U = G + R$, with $G$ to be defined later on
and $R$ a random matrix, we have that $A_2$ is almost uniformly random in $\mathcal{R}^{m_2 \times 1}$ by
Micciancio's regularity lemma (Lemma 6). More precisely, the $i$-th row of $R$ is
chosen from $(\{-1, 0, 1\}^n)^r \times (\{0\}^n)^{m_1 - r}$.

Lemma 6 (Adapted from fffik Th. 4.2]). Let $\mathbb{F}$ be a finite field and $f \in \mathbb{F}[x]$
be monic and of degree $n > 0$. Let $R$ be the ring $\mathbb{F}[x]/f$. Let $D \subseteq \mathbb{F}$ and $r > 0$.
For $a_1, \ldots, a_r \in R$, we denote by $H(a_1, \ldots, a_r)$ the random variable $\sum_{i \le r} b_i a_i \in
R$ where the $b_i$'s are degree $< n$ polynomials with coefficients chosen independently
and uniformly in $D$. If $U_1, \ldots, U_r$ are independent uniform random
variables in $R$, then the statistical distance to uniformity of $(U_1, \ldots, U_r,
H(U_1, \ldots, U_r))$ is below:

$$\frac{1}{2} \sqrt{\prod_{i \le t} \left(1 + \left(\frac{|\mathbb{F}|}{|D|^r}\right)^{\deg f_i}\right) - 1},$$

where $f = \prod_{i \le t} f_i$ is the factorization of $f$ over $\mathbb{F}$.

We show below how to choose $P$ and $G$ such that $PG = H - I_{m_1}$. With this
relation, the design principle form of $S$ therefore implies that $V = -H + P(G +
R) = PR - I_{m_1}$, and $D = B(G + R)$. Our constructions for $P, G, B$ also ensure
that $P$, $B$ and $BG$ have 'small' entries so that $S$ has 'small' entries.

A construction of $H$ without HNF. We start with how to construct $H$ for
$A_1 = [a_1, \ldots, a_{m_1}]^T \in \mathcal{R}^{m_1 \times 1}$. Since $m_1 \ge \max\{\sigma, \kappa, r\}$, we have $a_{i^*} \in \mathcal{R}^*$
for some index $i^*$ with probability at least $1 - p$, where $\mathcal{R}^*$ denotes the set of
invertible elements of $\mathcal{R}$. For now, we set $i^* = 1$ for simplicity. Using this $a_{i^*}$,
we can construct an HNF-like matrix $H$: the first row is $q e_1$ and the $i$-th row is
$h_i e_1 + e_i$ for $i = 2, \ldots, m_1$, where $e_i$ is a row vector in $\mathcal{R}_0^{m_1}$ such that the $i$-th
element is 1 and others are 0, and $h_i = -a_i \cdot a_1^{-1} \bmod q$ such that $h_i \in [0, q)^n$.
Let $\mathbf{h}_i$ denote the $i$-th row of $H$. By the definition of $H$, $H \cdot A_1 \equiv 0 \bmod q$. Thus,
each row vector $\mathbf{h}_i$ is in $M^{\perp}(a_1)$, where $a_1 = A_1$. It is obvious that $\mathbf{h}_1, \ldots, \mathbf{h}_{m_1}$
are linearly independent over $\mathcal{R}_0$. Hence, we only need to show that $H$ is indeed
a basis of $M^{\perp}(a_1)$, but this is routine work.
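
The shape of the HNF-like matrix can be illustrated in the degenerate case $n = 1$, where the ring $\mathbb{Z}_q[x]/(x+1)$ is just $\mathbb{Z}_q$ and ring inversion is ordinary modular inversion. This is only a toy sketch under that simplifying assumption: the function name `hnf_like` is hypothetical, the first entry is assumed invertible (the case $i^* = 1$), and the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
def hnf_like(a, q):
    """Rows: q*e_1, then h_i*e_1 + e_i with h_i = -a_i * a_1^{-1} mod q."""
    m1 = len(a)
    inv = pow(a[0], -1, q)  # raises ValueError when a_1 is not invertible mod q
    H = [[0] * m1 for _ in range(m1)]
    H[0][0] = q
    for i in range(1, m1):
        H[i][0] = (-a[i] * inv) % q  # representative in [0, q)
        H[i][i] = 1
    return H


a = [3, 5, 8, 13]
q = 17
H = hnf_like(a, q)
# Every row of H is orthogonal to a modulo q.
print([sum(h * x for h, x in zip(row, a)) % q for row in H])
```

Since $H$ is lower triangular with diagonal $(q, 1, \ldots, 1)$, its determinant is $q$, which is the index of the solution lattice in $\mathbb{Z}^{m_1}$ in this toy case.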

Next, we consider the case where $i^* \ne 1$. In this case, we swap rows 1 and $i^*$
of $A_1$ so that $a_1 \in \mathcal{R}^*$, and call it $A'_1$. Applying the method above, we get a
basis $H'$ of $A^{\perp}(A'_1)$. By swapping columns 1 and $i^*$ and rows 1 and $i^*$ of $H'$,
we get a basis $H$ of $A^{\perp}(A_1)$. In the following, we denote by $i^*$ the index $i$ such
that $a_i \in \mathcal{R}^*$ and $h_{i,i} = q$. Note that our strategy fails if there is no index $i$ such
that $a_i \in \mathcal{R}^*$: this is not an issue, as this occurs only with small probability.

Preliminaries of the construction. Hereafter, we set $W = BG$. We often use
the matrix $T_{\kappa} = (t_{i,j}) \in \mathcal{R}_0^{\kappa \times \kappa}$, where $t_{i,i} = 1$, $t_{i+1,i} = -2$, and all other $t_{i,j}$'s
are 0. Notice that the $i$-th row of $T_{\kappa}^{-1}$ is $(2^{i-1}, 2^{i-2}, \ldots, 1, 0, \ldots, 0) \in \mathcal{R}_0^{\kappa}$.

3.2 An Analogue to the Second Alwen-Peikert Construction 

The idea of the second construction in jS] is to have $G$ contain the rows of $H - I_{m_1}$.
This helps decrease the norms of the rows of $P$ and $V$. To do so, we define
$B = \mathrm{diag}(T_{\kappa}, \ldots, T_{\kappa}, I_{m_2 - m_1 \kappa})$. Note that $B^{-1} = \mathrm{diag}(T_{\kappa}^{-1}, \ldots, T_{\kappa}^{-1}, I_{m_2 - m_1 \kappa})$.

Let $h'_j$ denote the $j$-th row of $H - I_{m_1}$. Let $W = [W_1; W_2; \ldots; W_{m_1}; 0]$, where
$W_j = [w_{j,\kappa}; \ldots; w_{j,1}] \in \mathcal{R}_0^{\kappa \times m_1}$. We compute the $w_{j,k}$'s such that $h'_j = \sum_{k \le \kappa} 2^{k-1} \cdot
w_{j,k}$ and the components of all $w_{j,k}$'s are polynomials with coefficients in $\{0, 1\}$.
By this construction, $T_{\kappa}^{-1} \cdot W_j$ contains $h'_j$ in the last row. Then, $G = B^{-1} \cdot W$
contains rows $h'_j$ for $j = 1, \ldots, m_1$. The matrix $P = [p_1; \ldots; p_{m_1}]$ picks all rows
$h'_1, \ldots, h'_{m_1}$ in $G$ by setting $p_j = e_{\kappa j} \in \mathcal{R}_0^{m_2}$.
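
The role of $T_{\kappa}$ and of the binary decomposition can be checked on integers. The sketch below works with plain integer vectors standing in for coefficient vectors; the helper names are hypothetical, and the entries of the decomposed vector are assumed to lie in $[0, 2^{\kappa})$, which holds for coefficients in $[0, q)$ when $\kappa = \lceil 1 + \log q \rceil$.

```python
def t_matrix(kappa):
    """T_kappa: 1 on the diagonal and -2 immediately below it."""
    T = [[0] * kappa for _ in range(kappa)]
    for i in range(kappa):
        T[i][i] = 1
        if i + 1 < kappa:
            T[i + 1][i] = -2
    return T


def t_inverse(kappa):
    """Closed form of the inverse: entry (i, j) is 2^(i-j) for j <= i (0-indexed)."""
    return [[2 ** (i - j) if j <= i else 0 for j in range(kappa)] for i in range(kappa)]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def bit_rows(h, kappa):
    """Rows w_kappa, ..., w_1 with h = sum_k 2^(k-1) * w_k and entries in {0, 1}."""
    return [[(x >> (k - 1)) & 1 for x in h] for k in range(kappa, 0, -1)]


kappa = 6
h = [13, 0, 16, 5]
W = bit_rows(h, kappa)
G_block = mat_mul(t_inverse(kappa), W)
print(G_block[-1] == h)
```

The block $W_j$ has 0/1 entries while the last row of $T_{\kappa}^{-1} W_j$ reproduces $h'_j$, which is why $W = BG$ stays small even though $G$ contains entries as large as $q$.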

The norm of $S$ is $\max\{\|S_1\|, \|S_2\|\}$, where $S_1 = [V|P]$ and $S_2 = [D|B]$. For
simplicity, we only consider the case where $f = x^n + 1$. In the general case, the
bound on $\|S\|$ involves an extra $\mathrm{EF}(f, 2)$ factor.

We have that $\|BG\|^2 = \|W\|^2 \le n$, since the entries of $h'_j$ are all 0 except one
which is either $h_i$ or $q - 1$. Hence, we obtain that

$$\|S_2\|^2 \le \|D\|^2 + \|B\|^2 \le (3\sqrt{nr} + \sqrt{n})^2 + 5 \le (4\sqrt{nr} + 3)^2.$$

It is obvious that $\|P\| \le 1$. Additionally, we have that $\|PR\|^2 \le nr$. Therefore:

$$\|S_1\|^2 \le \|V\|^2 + \|P\|^2 \le (\sqrt{nr} + 1)^2 + 1 \le (\sqrt{nr} + 2)^2,$$

which completes the proof of Theorem 2. □


4 From LWE to SIS 

We show that any efficient algorithm solving LWE with some non-negligible 
probability may be used by a quantum machine to efficiently solve SIS with 
non-negligible probability. A crucial property of the reduction is that the matrix 
underlying the SIS and LWE instances is preserved. This allows the reduction 
to remain valid while working on Ideal-SIS and Ideal-LWE. 

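Theorem 3. Let $q, m, n$ be integers, and $\alpha \in (0, 1)$ with $n \ge 32$, $\mathcal{P}oly(n) \ge
m \ge 5n \log q$ and $\alpha < \min\left(\frac{1}{10\sqrt{\ln(10m)}}, 0.006\right)$. Suppose that there exists an
algorithm that solves $\mathrm{LWE}_{m,q;\Psi_{\alpha q}}$ in time $T$ and with probability $\varepsilon \ge 4m \exp\left(-\frac{\pi}{4\alpha^2}\right)$.
Then there exists a quantum algorithm that solves $\mathrm{SIS}_{m,q;\frac{\sqrt{m}}{2\alpha}}$ in time $\mathcal{P}oly(T, n)$
and with probability $\frac{\varepsilon^3}{64} - O(\varepsilon^5) - 2^{-\Omega(n)}$. The result still holds when replacing
LWE by $\text{Ideal-LWE}^f$ and SIS by $\text{Ideal-SIS}^f$, for $f = x^n + 1$ with $n = 2^k \ge 32$,
$m \ge 41 \log q$ and $q \equiv 3 \bmod 8$.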

When $\alpha = O(1/\sqrt{n})$, the reduction applies even to a subexponential algorithm
for LWE (with success probability $\varepsilon = 2^{-o(n)}$), transforming it into a subexponential
quantum algorithm for SIS (with success probability $\varepsilon = 2^{-o(n)}$). The
reduction works also for larger $\alpha = O(1/\sqrt{\log n})$, but in this case only applies to
polynomial algorithms for LWE (with success probability $\varepsilon = \Omega(1/\mathcal{P}oly(n))$).
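
The two regimes can be illustrated numerically. The sketch below evaluates a lower bound of the form $4m \exp(-\pi/(4\alpha^2))$ on the success probability; this form is an assumption of the example, chosen to agree with the tail estimate $m \exp(-\pi/(2\alpha)^2) \le \varepsilon/4$ used in the proof of Lemma 7, and the function name and the sample values of $m$ and $n$ are hypothetical.

```python
import math


def lwe_success_threshold(m, alpha):
    """Assumed threshold 4 * m * exp(-pi / (4 * alpha^2)) on the LWE success probability."""
    return 4 * m * math.exp(-math.pi / (4 * alpha ** 2))


m = 1000
for n in (64, 256, 1024):
    small = lwe_success_threshold(m, 1 / math.sqrt(n))            # alpha ~ 1/sqrt(n)
    large = lwe_success_threshold(m, 1 / math.sqrt(math.log(n)))  # alpha ~ 1/sqrt(log n)
    print(n, small, large)
```

With $\alpha$ of order $1/\sqrt{n}$ the threshold decays exponentially in $n$, leaving room for subexponentially small success probabilities, whereas with $\alpha$ of order $1/\sqrt{\log n}$ it decays only polynomially.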

The reduction is made of two components. First, we argue that an algorithm 
solving LWE provides an algorithm that solves a certain bounded distance de- 
coding problem, where the error vector is normally distributed. In a second step, 
we show that Regev's quantum algorithm O Lemma 3.14] can use such an
algorithm to construct small solutions to SIS.

4.1 From LWE to BDD

An algorithm solving LWE allows us to solve, for certain lattices, a variation of 
the Bounded Distance Decoding problem. In that variation of BDD, the error 
vector is sampled according to a specified distribution. 

Definition 3. The problem $\mathrm{BDD}_{\chi}$ with parameter distribution $\chi(\cdot)$ is as follows:
Given an $n$-dimensional lattice $L$ and a vector $t = b + e$ where $b \in L$ and $e$ is
distributed according to $\chi(n)$, the goal is to find $b$. We say that a randomized
algorithm $\mathcal{A}$ solves $\mathrm{BDD}_{\chi}$ for a lattice $L$ with success probability $\ge \varepsilon$ if, for
every $b \in L$, on input $t = b + e$, algorithm $\mathcal{A}$ returns $b$ with probability $\ge \varepsilon$ over
the choice of $e$ and the randomness of $\mathcal{A}$.

For technical reasons, our reduction will require a randomized BDD X algorithm 
whose behaviour is independent of the solution vector b, even when the error 
vector is fixed. This is made precise below. 

Definition 4. A randomized algorithm $\mathcal{A}$ solving $\mathrm{BDD}_{\chi}$ for lattice $L$ is said
to be strongly solution-independent (SSI) if, for every fixed error vector $e$, the
probability (over the randomness of $\mathcal{A}$) that, given input $t = b + e$ with $b \in L$,
algorithm $\mathcal{A}$ returns $b$ is independent of $b$.

We show that if we have an algorithm that solves $\mathrm{LWE}_{m,q;\Psi_{\alpha q}}$, then we can
construct an algorithm solving $\mathrm{BDD}_{\nu_{\alpha q}}$ for some lattices. Moreover, the
constructed BDD algorithm is SSI.

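Lemma 7. Let $q, m, n$ be integers and $\alpha \in (0, 1)$, with $m, \log q = \mathcal{P}oly(n)$.
Suppose that there exists an algorithm $\mathcal{A}$ that solves $\mathrm{LWE}_{m,q;\Psi_{\alpha q}}$ in time $T$ and
with probability $\varepsilon \ge 4m \exp\left(-\frac{\pi}{4\alpha^2}\right)$. Then there exists $\mathcal{S} \subseteq \mathbb{Z}_q^{m \times n}$ of proportion $\ge
\varepsilon/2$ and an SSI algorithm $\mathcal{A}'$ such that if $G \in \mathcal{S}$, algorithm $\mathcal{A}'$ solves $\mathrm{BDD}_{\nu_{\alpha q}}$
for $L(G)$ in time $T + \mathcal{P}oly(n)$ and with probability $\ge \varepsilon/4$.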

Proof. If $G \in \mathbb{Z}_q^{m \times n}$ and $s \in \mathbb{Z}_q^n$ are sampled uniformly and if the coordinates
of $e$ are sampled according to $\Psi_{\alpha q}$, then $\mathcal{A}$ finds $s$ with probability $\ge \varepsilon$ over the
choices of $G$, $s$ and $e$ and a string $w$ of internal random bits. This implies that
there exists a subset $\mathcal{S}$ of the $G$'s of proportion $\ge \varepsilon/2$ such that for any $G \in \mathcal{S}$,
algorithm $\mathcal{A}$ succeeds with probability $\ge \varepsilon/2$ over the choices of $s$, $e$ and $w$. For
any $G \in \mathcal{S}$, we have $\Pr_{s,e,w}[\mathcal{A}(Gs + e, w) = s] \ge \varepsilon/2$.

On input $t = b + e$, algorithm $\mathcal{A}'$ works as follows: it samples $s$ uniformly in $\mathbb{Z}_q^n$;
it computes $t' = t + Gs$, which is of the form $t' = Gs' + qk + e$, where $k \in \mathbb{Z}^m$; it
calls $\mathcal{A}$ on $t' \bmod q$ and finds $s'$ (with probability $\ge \varepsilon/2$); it then computes $e' =
t' - Gs' \bmod q$ and returns $t - e'$. Suppose that $\mathcal{A}$ succeeds, i.e., we have $s = s'$.
Then $e' = e \bmod q$. Using the standard tail bound on the continuous Gaussian
and the lower bound on $\varepsilon$ we obtain that $e$ has a component of magnitude $\ge q/2$
with probability $\le m \exp(-\pi/(2\alpha)^2) \le \varepsilon/4$. The algorithm thus succeeds with
probability $\ge \varepsilon/2 - \varepsilon/4 = \varepsilon/4$. □
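
The re-randomization performed by the BDD algorithm in this proof can be sketched in code. The example below is a toy illustration only: the LWE solver is replaced by a hypothetical brute-force oracle that is usable only for tiny parameters, errors are small integers rather than Gaussian samples, and all names and numeric values are assumptions made for the example.

```python
import itertools
import random


def centered(v, q):
    """Representative of each entry in the interval [-(q//2), q//2] (q odd)."""
    return [((x + q // 2) % q) - q // 2 for x in v]


def mat_vec(G, s):
    return [sum(g * x for g, x in zip(row, s)) for row in G]


def toy_lwe_oracle(G, t, q):
    """Brute force: the s whose residual t - G s mod q has the smallest max-norm."""
    best = None
    for s in itertools.product(range(q), repeat=len(G[0])):
        r = centered([ti - gi for ti, gi in zip(t, mat_vec(G, s))], q)
        score = max(abs(x) for x in r)
        if best is None or score < best[0]:
            best = (score, list(s))
    return best[1]


def bdd_from_lwe(G, t, q, oracle, rng):
    """Shift t by a random lattice point, call the LWE oracle, then strip the error."""
    s = [rng.randrange(q) for _ in range(len(G[0]))]
    t_prime = [ti + gi for ti, gi in zip(t, mat_vec(G, s))]
    s_prime = oracle(G, [x % q for x in t_prime], q)
    e_prime = centered([tp - gi for tp, gi in zip(t_prime, mat_vec(G, s_prime))], q)
    return [ti - ei for ti, ei in zip(t, e_prime)]


q = 101
G = [[1], [10], [37]]
b = [209, -31, 259]            # b = G*7 + q*(2, -1, 0), a point of L(G)
t = [210, -32, 259]            # b plus the error (1, -1, 0)
print(bdd_from_lwe(G, t, q, toy_lwe_oracle, random.Random(0)))
```

Because the oracle only ever sees $t + Gs \bmod q$ for a fresh uniform $s$, its input distribution does not depend on $b$, which is the intuition behind the strong solution-independence of the resulting algorithm.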

We now show that an algorithm solving $\mathrm{BDD}_{\nu_s}$ can be used to solve a quantized
version of it. This quantization is required for the quantum part of our reduction.
The intuition behind the proof is that the discretization grid is so fine (the
parameter $R$ can be chosen extremely large) that at the level of the grid the
distribution $\nu_s$ looks constant.

Lemma 8. Let $s > 0$ and $L$ be an $n$-dimensional lattice. Suppose that there exists an
SSI algorithm $\mathcal{A}$ that solves $\mathrm{BDD}_{\nu_s}$ for $L$ in time $T$ and with probability $\varepsilon$. Then
there exists an $R$, whose bit-length is polynomial in $T, n, |\log s|$ and the bit-size
of the given basis of $L$, and an SSI algorithm $\mathcal{A}'$ that solves $\mathrm{BDD}_{D_{L/R,s}}$ within
a time polynomial in $\log R$ and with probability $\ge \varepsilon - 2^{-\Omega(n)}$.

At this point, we have an $R$ of bit-length polynomial in $T, n, |\log \alpha|$ and an SSI
algorithm $\mathcal{B}$ with run-time polynomial in $\log R$ that solves $\mathrm{BDD}_{D_{L(G)/R,\alpha q}}$, for
any $G$ in a subset $\mathcal{S} \subseteq \mathbb{Z}_q^{m \times n}$ of proportion $\ge \varepsilon/2$, with probability $\ge \varepsilon/4 - 2^{-\Omega(n)}$
over the random choices of $e$ and the internal randomness $w$. In the following we
assume that on input $t = b + e$, algorithm $\mathcal{B}$ outputs $e$ when it succeeds, rather
than $b$. We implement $\mathcal{B}$ quantumly as follows: the quantum algorithm $\mathcal{B}_Q$ maps
the state $|e\rangle\,|b + e\rangle\,|w\rangle$ to the state $|e - \mathcal{B}(b + e, w)\rangle\,|b + e\rangle\,|w\rangle$.

4.2 A New Interpretation of Regev’s Quantum Reduction 

We first recall Regev's quantum reduction [821 Lemma 3.14]. It uses a randomized
BDD oracle $\mathcal{B}^{wc}$ that finds the closest vector in a given lattice $L$ to a given
target vector, as long as the target is within a prescribed distance $d < \lambda_1(L)/2$ of $L$
(as above, we assume that $\mathcal{B}^{wc}$ returns the error vector). It returns a sample from
the distribution $D_{\hat{L}, \frac{\sqrt{n}}{\sqrt{2} d}}$. We implement oracle $\mathcal{B}^{wc}$ as a quantum oracle $\mathcal{B}^{wc}_Q$ as
above. We assume $\mathcal{B}^{wc}_Q$ accepts random inputs of length $\ell$.

1. Set $R$ to be a large constant and build a quantum state which
is within $\ell_2$ distance $2^{-\Omega(n)}$ of the normalized state corresponding
to $\sum_{w \in \{0,1\}^{\ell}} \sum_{x \in L/R, \|x\| < d} \rho_{d/\sqrt{n}}(x)\, |x\rangle\, |x \bmod L\rangle\, |w\rangle$.

2. Apply the BDD oracle $\mathcal{B}^{wc}_Q$ to the above state to remove the entanglement
and obtain a state which is within $\ell_2$ distance $2^{-\Omega(n)}$ of the normalized state
corresponding to $\sum_{x \in L/R, \|x\| < d} \rho_{d/\sqrt{n}}(x)\, |0\rangle\, |x \bmod L\rangle\, |w\rangle$.

3. Apply the quantum Fourier transform over $\mathbb{Z}_R^n$ to the second register to
obtain a state that is within $\ell_2$ distance $2^{-\Omega(n)}$ of the normalized state
corresponding to $\sum_{x \in \hat{L}, \|x\| < n/d} \rho_{\sqrt{n}/d}(x)\, |x \bmod (R \cdot \hat{L})\rangle$.

4. Measure the latter to obtain a vector $\hat{b} \bmod R \cdot \hat{L}$. Using Babai's algorithm jSJ,
recover $\hat{b}$ and output it. Its distribution is within statistical distance $2^{-\Omega(n)}$
of $D_{\hat{L}, \frac{\sqrt{n}}{\sqrt{2} d}}$.

We now replace the perfect oracle Bq c by an imperfect one. 

Lemma 9. Suppose we are given an n-dimensional lattice L, parameters R > 
2’ 2n X n (L) and s < ^7=, and an SSI algorithm B that solves BDD/^ L )S for L with 
run-time $T$ and success probability $\varepsilon$. Then there exists a quantum algorithm $\mathcal{R}$
which outputs a vector $\hat{b} \in \hat{L}$ whose distribution is within distance $1 - \varepsilon^2/2 +
O(\varepsilon^4) + 2^{-\Omega(n)}$ of $D_{\hat{L}, \frac{1}{2s}}$. It finishes in time polynomial in $T + \log R$.

Proof. The quantum algorithm $\mathcal{R}$ is Regev's algorithm above with parameter
$d = \sqrt{2n}\, s < \lambda_1(L)/2$, where $\mathcal{B}^{wc}_Q$ is replaced by the quantum implementation
$\mathcal{B}_Q$ of $\mathcal{B}$. We just saw that if the $\mathrm{BDD}_{D_{L/R,s}}$ oracle was succeeding with
probability $1 - 2^{-\Omega(n)}$, then the output vector $\hat{b}$ would follow a distribution whose
statistical distance to $D_{\hat{L}, \frac{1}{2s}}$ would be $2^{-\Omega(n)}$. To work around the requirement
that the oracle succeeds with overwhelming probability, we use the notion of
trace distance between two quantum states, which is an adaptation of the statistical
distance (see [23, Ch. 9]). The trace distance between two (pure) quantum
states $|t_1\rangle$ and $|t_2\rangle$ is $\delta(|t_1\rangle, |t_2\rangle) = \sqrt{1 - |\langle t_1 | t_2 \rangle|^2}$. Its most important property
is that for any generalized measurement (POVM), if $D_1$ (resp. $D_2$) is the resulting
probability distribution when starting from $|t_1\rangle$ (resp. $|t_2\rangle$) then $\Delta(D_1, D_2) \le
\delta(|t_1\rangle, |t_2\rangle)$. Let $|t_1\rangle$ denote the state at the end of Step 2 of Regev's algorithm
when we use $\mathcal{B}^{wc}$, and let $|t_2\rangle$ denote the state that we obtain at the end of
Step 2 when we use $\mathcal{B}$. We upper bound $\delta(|t_1\rangle, |t_2\rangle)$ as follows.

Since $\mathcal{B}^{wc}(x \bmod L, w) = x$ for $\|x\| < d$, we have that $|t_1\rangle$ is within $\ell_2$ distance
(and hence trace distance) $2^{-\Omega(n)}$ of the normalized state

$$|t'_1\rangle = 2^{-\ell/2} \sum_{w \in \{0,1\}^{\ell}} \sum_{x \in L/R} \sqrt{D^d_{L/R,s}(x)}\; |0\rangle\, |x \bmod L\rangle\, |w\rangle,$$

where $D^d_{L/R,s}$ denotes the normalized distribution obtained by truncating $D_{L/R,s}$
to vectors of norm $< d$. On the other hand, for the imperfect oracle $\mathcal{B}$, we have
that $|t_2\rangle$ is within trace distance $2^{-\Omega(n)}$ of the normalized state

$$|t'_2\rangle = 2^{-\ell/2} \sum_{w \in \{0,1\}^{\ell}} \sum_{x \in L/R} \sqrt{D^d_{L/R,s}(x)}\; |x - \mathcal{B}(x \bmod L, w)\rangle\, |x \bmod L\rangle\, |w\rangle.$$

Let $S_{\mathcal{B}} = \{(x, w) \in L/R \times \{0, 1\}^{\ell} \mid \|x\| < d \text{ and } \mathcal{B}(x \bmod L, w) = x\}$.
Notice that, if $(x, w) \notin S_{\mathcal{B}}$, the states $|x - \mathcal{B}(x \bmod L, w)\rangle\, |x \bmod L\rangle\, |w\rangle$
and $|0\rangle\, |x' \bmod L\rangle\, |w'\rangle$ are orthogonal for all $(x', w')$. Furthermore, if $(x, w) \in
S_{\mathcal{B}}$, the states $|0\rangle\, |x \bmod L\rangle\, |w\rangle$ and $|0\rangle\, |x' \bmod L\rangle\, |w'\rangle$ are orthogonal for
all $(x', w') \ne (x, w)$ with $\|x'\| < d$, because the mapping $x \mapsto x \bmod
L$ is 1-1 over $x$ of norm $< d < \lambda_1(L)/2$. It follows that $|\langle t'_1 | t'_2 \rangle| =
\sum_{(x,w) \in S_{\mathcal{B}}} 2^{-\ell} D^d_{L/R,s}(x)$. Hence, $|\langle t'_1 | t'_2 \rangle|$ is equal to the probability $p^d$
that $\mathcal{B}(x \bmod L, w) = x$, over the choices of $x$ from the distribution $D^d_{L/R,s}$
and $w$ uniformly random in $\{0, 1\}^{\ell}$. By Lemma 2 (using the fact that $d \ge \sqrt{2n}\, s$),
we have $p^d \ge p - 2^{-\Omega(n)}$, where $p$ is the corresponding probability when $x$ is
sampled from $D_{L/R,s}$. Finally, we have $p = \sum_x D_{L/R,s}(x) \Pr_w[\mathcal{B}(x \bmod L, w) = x]$.
By the strong solution-independence of $\mathcal{B}$, we have $\Pr_w[\mathcal{B}(x \bmod L, w) = x] =
\Pr_w[\mathcal{B}(b + x, w) = x]$ for any fixed $b \in L$. Therefore, $p$ is the success probability
of $\mathcal{B}$ in solving $\mathrm{BDD}_{D_{L/R,s}}$, so $p \ge \varepsilon$ by assumption. Overall, we conclude
that $\delta(|t_1\rangle, |t_2\rangle) \le \sqrt{1 - \varepsilon^2} + 2^{-\Omega(n)}$, and hence the output of $\mathcal{R}$ is within
statistical distance $1 - \varepsilon^2/2 + O(\varepsilon^4) + 2^{-\Omega(n)}$ of $D_{\hat{L}, \frac{1}{2s}}$, as claimed. □

To prove Theorem 3, we apply Lemma 9 to the lattices $L(G)$ for $G \in \mathcal{S}$, with
algorithm $\mathcal{B}$. For that, we need to ensure that the hypothesis $\alpha q < \frac{\lambda_1(L(G))}{2\sqrt{2m}}$ is
satisfied. From Lemma 4 (resp. Lemma 5 in the case of Ideal-LWE), we know that
with probability $1 - 2^{-\Omega(n)}$ over the choice of $G$ in $\mathbb{Z}_q^{m \times n}$, we have $\lambda_1^{\infty}(L(G)) \ge q/4$
and $\lambda_1(L(G)) \ge 0.07\sqrt{m}\,q$. For such 'good' $G$'s, the hypothesis $\alpha q < \frac{\lambda_1(L(G))}{2\sqrt{2m}}$ is
satisfied, since $\alpha < 0.006$. The set $\mathcal{S}'$ of the $G$'s in $\mathcal{S}$ for which that condition is
satisfied represents a proportion $\ge \varepsilon/2 - 2^{-\Omega(n)}$ of $\mathbb{Z}_q^{m \times n}$. Suppose now that $G \in
\mathcal{S}'$. Lemma 9 shows that we can find a vector $s \in G^{\perp} = q \cdot \widehat{L(G)}$ that follows a
distribution whose distance to $D_{G^{\perp}, \frac{1}{2\alpha}}$ is $\Delta = 1 - \frac{\varepsilon^2}{32} + O(\varepsilon^4) + 2^{-\Omega(n)}$. Thanks
to Lemmas 0 and 0 (since $G \in \mathcal{S}'$ and $\alpha < 1/(10\sqrt{\ln(10m)})$, the hypothesis
of Lemma 0 is satisfied), we have that with probability $\ge 1 - 2^{-\Omega(n)} - \Delta =
\frac{\varepsilon^2}{32} - O(\varepsilon^4) - 2^{-\Omega(n)}$, the returned $s$ is a non-zero vector of $G^{\perp}$ whose norm
is $\le \frac{\sqrt{m}}{2\alpha}$. Multiplying by the probability $\ge \varepsilon/2 - 2^{-\Omega(n)}$ that $G \in \mathcal{S}'$ gives the
claimed success probability and completes the proof of Theorem 3. □
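
The trace-distance step used in the proof of Lemma 9 can be checked numerically on a two-dimensional example. The sketch below assumes normalized pure states given as lists of amplitudes; the function name `trace_distance_pure` and the test states are hypothetical and only illustrate the formula $\sqrt{1 - |\langle t_1|t_2\rangle|^2}$ together with the elementary bound $\sqrt{1 - p^2} \le 1 - p^2/2$.

```python
import math


def trace_distance_pure(t1, t2):
    """sqrt(1 - |<t1|t2>|^2) for normalized pure states given as amplitude lists."""
    inner = sum(complex(a).conjugate() * complex(b) for a, b in zip(t1, t2))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


for p in (0.05, 0.25, 0.5, 0.9):
    ideal = [1.0, 0.0]                          # state obtained with a perfect oracle
    actual = [p, math.sqrt(1.0 - p * p)]        # overlap p with the ideal state
    d = trace_distance_pure(ideal, actual)
    print(p, d, 1.0 - p * p / 2.0)
```

In each printed row the second value does not exceed the third, matching the passage from an overlap of at least $\varepsilon$ to a distance of at most $1 - \varepsilon^2/2$ up to lower-order terms.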


5 Cryptographic Applications 

We now use the results of Sections 3 and 4 to construct efficient cryptographic
primitives based on ideal lattices. This includes the first provably secure lattice-based
public-key encryption scheme with asymptotically optimal encryption and
decryption computation costs of $\tilde{O}(1)$ bit operations per message bit.


5.1 Efficient Public-Key Encryption Scheme 

Our scheme is constructed in two steps. Firstly, we use the LWE mapping
$(s, e) \mapsto G \cdot s + e \bmod q$ as an injective trapdoor one-way function, with the
trapdoor being the full-dimensional set of vectors in $G^{\perp}$ from Section 3 and the
one-wayness being as hard as Ideal-SIS (and hence Ideal-SVP) by Theorem 3.
This is an efficient ideal lattice analogue of some trapdoor functions presented
in (9I28[ for arbitrary lattices. Secondly, we apply the Goldreich-Levin hardcore
function based on Toeplitz matrices [I ( II Sec. 2.5] to our trapdoor function, and
XOR the message with the hardcore bits to obtain a semantically secure encryption.
To obtain the $\tilde{O}(1)$ amortized bit complexity per message bit, we use $\Omega(n)$
hardcore bits, which induces a subexponential loss in the security reduction.

Our trapdoor function family Id-Trap is defined in Figure 1. For security
parameter $n = 2^k$, we fix $f(x) = x^n + 1$ and $q = \mathcal{P}oly(n)$ a prime satisfying
$q \equiv 3 \bmod 8$. From Lemma 3, it follows that $f$ splits modulo $q$ into two
irreducible factors of degree $n/2$. We set $\sigma = 1$, $r = \lceil 1 + \log_3 q \rceil = \tilde{O}(1)$ and
$m = (\lceil \log q \rceil + 1)\sigma + r = \tilde{O}(1)$. We define $\mathcal{R} = \mathbb{Z}_q[x]/f$. The following lemma
ensures the correctness of the scheme (this is essentially identical to [20, Sec. 4.1])
and asserts that the evaluation and inversion functions can be implemented
efficiently.

- Generating a function with trapdoor. Run the algorithm from Theorem 2, using
$f = x^n + 1$, $n, q, r, \sigma, m$ as inputs. Suppose it succeeds. It returns $g \in (\mathbb{Z}_q[x]/f)^m$
(function index) and a trapdoor full-rank set $S$ of linearly independent vectors
in $\mathrm{rot}_f(g)^{\perp} \subseteq \mathbb{Z}^{mn \times mn}$ with $\|S\| \le \sqrt{2}(4\sqrt{nr} + 3) =: L$ (we have $L = \tilde{O}(\sqrt{n})$).

- Function evaluation. Given function index $g$, we define the trapdoor function
$h_g : \mathbb{Z}_q^n \times \mathbb{Z}_q^{mn} \to \mathbb{Z}_q^{mn}$ as follows. On input $s$ uniformly random in $\mathbb{Z}_q^n$ and $e \in \mathbb{Z}_q^{mn}$
sampled from $\bar{\Psi}_{\alpha q}$ (defined as the rounding of $\Psi_{\alpha q}$ to the closest integer vector), we
compute and return: $c = h_g(s, e) := \mathrm{rot}_f(g) \cdot s + e \bmod q$.

- Function inversion. Given $c = h_g(s, e)$ and trapdoor $S$, compute $d = S^T \cdot c \bmod q$
and $e' = S^{-T} d$ (in $\mathbb{Q}$). Compute $u = c - e' \bmod q$ and $s' = (\mathrm{rot}_f(g_1))^{-1} \cdot u_1 \bmod q$,
where $u_1$ consists of the first $n$ coordinates of $u$. Return $(s', e')$.

Fig. 1. The trapdoor function family Id-Trap
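
The inversion procedure of Figure 1 can be traced on a toy unstructured instance. The sketch below is a minimal illustration under strong simplifying assumptions: $n = 1$ (so $\mathrm{rot}_f(g_1)$ is a single unit of $\mathbb{Z}_q$), the trapdoor vectors are stored as the rows of `S` (so the product `S c` plays the role of $S^T \cdot c$), the short basis was found by hand rather than by the algorithm of Theorem 2, and all names and numeric values are hypothetical.

```python
from fractions import Fraction


def centered(v, q):
    """Representative of each entry in the interval [-(q//2), q//2] (q odd)."""
    return [((x + q // 2) % q) - q // 2 for x in v]


def solve(S, d):
    """Solve S x = d over the rationals by Gauss-Jordan elimination (S invertible)."""
    k = len(S)
    A = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(S, d)]
    for col in range(k):
        piv = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0:
                A[r] = [x - A[r][col] * y for x, y in zip(A[r], A[col])]
    return [row[-1] for row in A]


def evaluate(G, s, e, q):
    """c = G s + e mod q."""
    return [(sum(g * x for g, x in zip(row, s)) + ei) % q for row, ei in zip(G, e)]


def invert(G, S, c, q):
    # Rows of S lie in G^perp, hence S c = S e mod q; e is short, so S e is exact.
    d = centered([sum(a * b for a, b in zip(row, c)) for row in S], q)
    e = [int(x) for x in solve(S, d)]
    u = [(ci - ei) % q for ci, ei in zip(c, e)]
    s = [(u[0] * pow(G[0][0], -1, q)) % q]  # toy case n = 1 only
    return s, e


q = 101
G = [[1], [10], [37]]
S = [[-3, 4, -1], [0, -1, 3], [10, -1, 0]]  # short vectors b with b.G = 0 mod q
c = evaluate(G, [42], [1, -1, 2], q)
print(invert(G, S, c, q))
```

Inversion succeeds as long as every entry of $Se$ has absolute value below $q/2$, which is the kind of condition that the bounds on $q$, $\alpha$ and $\|S\|$ in Lemma 10 are meant to guarantee.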

Lemma 10. Let $q > 2\sqrt{mn}\,L$ and $\alpha = o(1/(L\sqrt{\log n}))$. Then for any $s \in \mathbb{Z}_q^n$
and for $e$ sampled from $\bar{\Psi}_{\alpha q}$, the inversion algorithm recovers $(s, e)$ with probability
$1 - n^{-\omega(1)}$ over the choice of $e$. Furthermore, the evaluation and inversion
algorithms for $h_g$ can be implemented with run-time $\tilde{O}(n)$.

The one-wayness of Id-Trap is equivalent to the hardness of $\mathrm{LWE}_{m,q;\bar{\Psi}_{\alpha q}}$.
Furthermore, an instance of $\mathrm{LWE}_{m,q;\Psi_{\alpha q}}$ can be efficiently converted by rounding to
an instance of $\mathrm{LWE}_{m,q;\bar{\Psi}_{\alpha q}}$. This proves Lemma 11.

Lemma 11. Any attacker against the one-wayness of Id-Trap (with parameters
$m, \alpha, q$) with run-time $T$ and success probability $\varepsilon$ provides an algorithm
for $\mathrm{LWE}_{m,q;\Psi_{\alpha q}}$ with run-time $T$ and success probability $\varepsilon$.

By combining our trapdoor function with the GL hardcore function [E3 Sec. 2.5]
we get the encryption scheme of Figure 2.

- Key generation. For security parameter $n$, run the generation algorithm of Id-Trap
to get an $h_g$ and a trapdoor $S$. We can view the first component of the domain of $h_g$
as a subset of $\mathbb{Z}_2^{\ell_I}$ for $\ell_I = O(n \log q) = \tilde{O}(n)$. Generate $r \in \mathbb{Z}_2^{\ell_I + \ell_M}$ uniformly and
define the Toeplitz matrix $M_{GL} \in \mathbb{Z}_2^{\ell_M \times \ell_I}$ (allowing fast multiplication j'ltij ) whose
$i$th row is $[r_i, \ldots, r_{\ell_I + i - 1}]$. The public key is $(g, r)$ and the secret key is $S$.

- Encryption. Given $\ell_M$-bit message $M$ with $\ell_M = n/\log n = \tilde{\Omega}(n)$ and public
key $(g, r)$, sample $(s, e)$ with $s \in \mathbb{Z}_q^n$ uniform and $e$ sampled from $\bar{\Psi}_{\alpha q}$, and evaluate
$C_1 = h_g(s, e)$. Compute $C_2 = M \oplus (M_{GL} \cdot s)$, where the product $M_{GL} \cdot s$ is computed
over $\mathbb{Z}_2$, and $s$ is viewed as a string over $\mathbb{Z}_2^{\ell_I}$. Return the ciphertext $(C_1, C_2)$.

- Decryption. Given ciphertext $(C_1, C_2)$ and secret key $(S, r)$, invert $C_1$ to compute
$(s, e)$ such that $h_g(s, e) = C_1$, and return $M = C_2 \oplus (M_{GL} \cdot s)$.

Fig. 2. The semantically secure encryption scheme Id-Enc
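
The masking step of Figure 2 can be sketched independently of the trapdoor function. The example below only illustrates the Toeplitz-based pad: it uses a naive quadratic-time product rather than a fast multiplication algorithm, it treats the bit string derived from $s$ as given, and the function names and sample bits are hypothetical.

```python
def toeplitz_mul(r, s_bits, l_m):
    """Product over Z_2 of the l_m x l_i Toeplitz matrix with rows
    [r_i, ..., r_{i + l_i - 1}] (0-indexed) and the bit vector s_bits."""
    l_i = len(s_bits)
    assert len(r) >= l_i + l_m - 1, "not enough bits to define the matrix"
    return [sum(r[i + j] & s_bits[j] for j in range(l_i)) % 2 for i in range(l_m)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


r = [1, 0, 1, 1, 0, 1, 0]      # public random bits
s_bits = [1, 1, 0, 1]          # bit representation of the trapdoor-function input s
message = [1, 0, 1]

pad = toeplitz_mul(r, s_bits, len(message))
c2 = xor(message, pad)          # second ciphertext component
recovered = xor(c2, toeplitz_mul(r, s_bits, len(message)))
print(pad, c2, recovered)
```

Decryption recomputes the same pad after recovering $s$ from $C_1$ with the trapdoor, so correctness of the XOR step reduces to correctness of the inversion.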


Theorem 4. Any IND-CPA attacker against Id-Enc with run-time $T$ and success
probability $1/2 + \varepsilon$ provides an algorithm for $\text{Ideal-LWE}^f_{m,q;\Psi_{\alpha q}}$ with run-time
$O(2^{3\ell_M} n^3 \varepsilon^{-3} \cdot T)$ and success probability $\Omega(2^{-\ell_M} n^{-1} \cdot \varepsilon)$.

Proof. The attacker can be converted to a GL hardcore function distinguisher
that, given $C_1 = h_g(s, e)$, $M_{GL}$, and $\ell_M$-bit string $z$, for $s$ sampled uniformly
in $\mathbb{Z}_q^n$, $e$ sampled from $\bar{\Psi}_{\alpha q}$, and $M_{GL}$ constructed as in the key generation
procedure, distinguishes whether $z$ is uniformly random (independent of $s$ and $e$)
or $z = M_{GL} \cdot s$. It has run-time $T$ and advantage $\varepsilon$. The result follows by applying
Lemma 2.5.8, Proposition 2.5.7 and Proposition 2.5.3 in ETQjj. Note that we do
not need to give the vector $e$ additionally to $s$ as input to the GL function, as $e$
is uniquely determined once $s$ is given (with overwhelming probability). □

By using Lemma, ITOI and Theorems 0 0and0 we get our main result. 

Corollary 1. Any IND-CPA attacker against encryption scheme Id-Enc with
run-time $2^{o(n)}$ and success probability $1/2 + 2^{-o(n)}$ provides a quantum algorithm
for $\tilde{O}(n^2)$-Ideal-SVP with $f(x) = x^n + 1$ and $n = 2^k$, with run-time $2^{o(n)}$ and
overwhelming success probability. Furthermore, the scheme Id-Enc encrypts and
decrypts $\tilde{\Omega}(n)$ bits within $\tilde{O}(n)$ bit operations, and its keys have $\tilde{O}(n)$ bits.

5.2 Further Applications 

Our results have several other applications, adapting various known construc- 
tions for unstructured lattices to ideal lattices, as summarised below. 

CCA2-secure encryption. Peikert j2B| derived a CCA2-secure encryption
scheme from the non-structured variant of the trapdoor function family Id-Trap
from Figure 1, using the framework of [31 134] for building a CCA2-secure scheme
from a collection of injective trapdoor functions that is secure under correlated
product (i.e., one-wayness is preserved if several functions are evaluated on the
same input). The approach of |2B| can be applied to Id-Trap, using the equality
between $\text{Ideal-LWE}^f_{km}$ and the product of $k$ instances of $\text{Ideal-LWE}^f_m$, multiple
hardcore bits as in Id-Enc, and instantiating the required strongly unforgeable
signature with the Ideal-SVP-based scheme of |XB|. By choosing $k = \tilde{O}(n)$ (the
bit-length of the verification key in U2I) and $\alpha = \tilde{O}(n^{-3/2})$, we obtain a CCA2-secure
scheme that encrypts $\tilde{\Omega}(n)$ bits within $\tilde{O}(n^2)$ bit operations and whose
security relies on the exponential quantum hardness of $\tilde{O}(n^4)$-Ideal-SVP.

Trapdoor signatures. Gentry et al. || give a construction of a trapdoor
signature (in the random oracle model) from any family of collision-resistant
preimage sampleable functions (PSFs). They show how to sample preimages
of $f_G(x) = x^T G$, where $G \in \mathbb{Z}_q^{m \times n}$, using a full-dimensional set of short vectors
in $G^{\perp}$. By applying this to $G = \mathrm{rot}_f(g)$ and using the trapdoor generation
algorithm from Section 3, we obtain a PSF whose collision resistance relies
on Ideal-SIS, and hence Ideal-SVP, and thus a structured variant of the trapdoor
signature scheme of jOj, with $\tilde{O}(n)$ verification time and signature length.

ID-based identification. From lattice-based signatures, we derive ID-based
identification (IBI) and ID-based signature (IBS). Applying the standard strategy,
we construct lattice-based IBI schemes as follows: The master generates a
key pair of a lattice-based signature scheme, say $(G, S)$; each user obtains from
the master a short vector $e$ such that $e^T G = H(id)$, where $H$ is a random oracle;
the prover proves to the verifier that he/she has a short vector $e$ through the
Micciancio-Vadhan protocol m. This combination yields concurrently secure
IBI schemes based on $\tilde{O}(n^2)$-SVP and $\tilde{O}(n^2)$-Ideal-SVP in the random oracle
model. As the MV protocol is witness indistinguishable, we can use the Fiat-Shamir
heuristic |B1 and derive lattice-based IBS schemes.

ID-based encryption (IBE). It is shown in jS| that the unstructured variant
of the above trapdoor signature can be used as the identity key extraction for
an IBE scheme. This requires a ‘dual’ version of Id-Enc, in which the public key
is of the form $(g, u)$, where $u = H(id)$ is the hashed identity, and the secret
key is the signature of $id$, i.e., a short preimage of $u$ under $f_g(x) = x^T \mathrm{rot}_f(g)$.
We construct the ‘dual’ encryption as $(C_1, C_2)$ where $C_1 = h_g(s, e)$ and
$C_2 = T_\ell(\mathrm{rot}_f(u) \cdot s) + M$, where $M \in \mathbb{Z}_q^{\ell}$ contains the message and
$T_\ell(\mathrm{rot}_f(u) \cdot s)$ denotes the first $\ell$ coordinates of $\mathrm{rot}_f(u) \cdot s \bmod q$.
By adapting the results of |I3], we show that $T_\ell(\mathrm{rot}_f(u) \cdot s)$ is an
exponentially-secure generic hardcore function for uniform $u \in \mathbb{Z}_q^n$, when
$\ell = o(n)$. This allows us to prove the IND-CPA security
of the resulting IBE scheme based on the hardness of Ideal-SVP.

Abstract. We describe a public-key encryption scheme based on lattices
(specifically, based on the hardness of the learning with errors (LWE)
problem) that is secure against chosen-ciphertext attacks while
admitting (a variant of) smooth projective hashing. This encryption
scheme suffices to construct a protocol for password-based authenticated
key exchange (PAKE) that can be proven secure based on the LWE
assumption in the standard model. We thus obtain the first PAKE protocol
whose security relies on a lattice-based assumption.


1 Password-Based Authenticated Key Exchange 

Protocols for password-based authenticated key exchange (PAKE) enable two 
users to generate a common, cryptographically-strong key based on an initial, 
low-entropy, shared secret (i.e., a password). The difficulty in this setting is to 
prevent off-line dictionary attacks where an adversary exhaustively enumerates 
potential passwords on its own, attempting to match the correct password to 
observed protocol executions. Roughly, a PAKE protocol is “secure” if off-line 
attacks are of no use and the best attack is an on-line dictionary attack where an 
adversary must actively try to impersonate an honest party using each possible 
password. On-line attacks of this sort are inherent in the model of password- 
based authentication; more importantly, they can be detected by the server as 
failed login attempts and (at least partially) defended against. 

Due to the widespread use of passwords, a significant amount of research has 
focused on designing PAKE protocols. Early work [E] (see also (Hj) considered 
a “hybrid” model where users share public keys in addition to a password. In 
the more challenging “password-only” setting clients and servers are required to 
share only a password. Bellovin and Merritt 0 initiated research in this direc- 
tion, and presented a PAKE protocol with heuristic arguments for its security. 
It was not until several years later that formal models for PAKE were
developed [3,5,11], and provably secure PAKE protocols were shown in the random
oracle/ideal cipher models [3,5,18].

Goldreich and Lindell m constructed the first PAKE protocol without random
oracles, and their approach remains the only one for the plain model where
there is no additional setup. Unfortunately, their protocol is inefficient in terms of
communication, computation, and round complexity. (Nguyen and Vadhan jl t)j
show efficiency improvements, but achieve a weaker notion of security. In any
case, their protocol is similarly impractical.) The Goldreich-Lindell protocol also
does not tolerate concurrent executions by the same party.

Katz, Ostrovsky, and Yung demonstrated the first efficient PAKE proto- 
col with a proof of security in the standard model; extensions and improvements 
of this protocol were given in J!)l(ill tilHi . In contrast to the work of Goldreich 
and Lindell, these protocols are secure even under concurrent executions by the 
same party. On the other hand, these protocols all require a common reference 
string (CRS). While this may be less appealing than the “plain model,” reliance 
on a CRS does not appear to be a serious drawback in the context of PAKE 
since the CRS can be hard-coded into the protocol implementation. A different 
PAKE protocol in the CRS model is given by Jiang and Gong Id 

PAKE based on lattices? Cryptographic primitives based on lattices are ap- 
pealing because of known worst-case/average-case connections between lattice 
problems, as well as because several lattice problems are currently immune to 
quantum attacks. Also, the best-known algorithms for several lattice problems
require exponential time (in contrast to sub-exponential algorithms for, e.g.,
factoring). None of the existing PAKE constructions (in either the random oracle
or standard models), however, can be instantiated with lattice-based
assumptions.¹ The barrier to constructing a lattice-based PAKE protocol using the
KOY/GL approach [17,9] is that this approach requires a CCA-secure encryption
scheme (more generally, a non-malleable commitment scheme) with an associated
smooth projective hash system [7,9]. (See Section 2.) Until recently, the
existence of CCA-secure encryption schemes based on lattices (even ignoring the
additional requirement of smooth projective hashing) was open. Peikert and
Waters m gave the first constructions of CCA-secure encryption based on lattices,
but the schemes they propose are not readily amenable to the smooth projective
hashing requirement. Subsequent constructions [24,20,12] do not immediately
support smooth projective hashing either.

1.1 Our Results 

Building on ideas of [24,20,12], we show a new construction of a CCA-secure
public-key encryption scheme based on the hardness of the learning with
errors (LWE) problem [22]. We then demonstrate (a variant of) a smooth projective
hash system for our scheme. This is the most technically difficult aspect of our
work, and is of independent interest as the first construction of a smooth
projective hash system (for a conjectured hard-on-average language) based on lattice
assumptions. (Instantiating the smooth projective hash framework using lattice
assumptions is stated as an open question in B50.) Finally, we show that our
encryption scheme can be plugged into a modification of the
Katz-Ostrovsky-Yung/Gennaro-Lindell framework [17,9] to give a PAKE protocol based on the
LWE assumption.

1 To the best of our knowledge this includes the protocol of Goldreich and Lindell m,
which requires a one-to-one one-way function on an infinite domain (in addition to
oblivious transfer, which can be based on lattice assumptions ED).

Organization of the paper. In Section 2 we define a variant of smooth
projective hashing (SPH) that we call approximate SPH. We then show in Section 3
that a CCA-secure encryption scheme having an approximate SPH system
suffices for our desired application to PAKE.

The main technical novelty of our paper is in the sections that follow. In
Section 4 we review the LWE problem and some preliminaries. As a prelude to
our main construction, we show in Section 5 a CPA-secure encryption scheme
based on the LWE problem, with an associated approximate SPH system. In
Section 6 we describe how to extend this initial scheme to obtain CCA-security.

Throughout the paper, we denote the security parameter by n. 

2 Approximate Smooth Projective Hash Functions 

Smooth projective hash functions were introduced by Cramer and Shoup [7];
we follow (and adapt) the treatment of Gennaro and Lindell [9], who extend
the original definition. Rather than aiming for utmost generality, we tailor the
definitions to our eventual application.

Roughly speaking, the differences between our definition and that of
Gennaro-Lindell are as follows. (This discussion assumes familiarity with [9]; for the reader
not already familiar with that work, a self-contained description is given below.)
In [9] there are sets $X$ and $\bar{L} \subset X$; correctness is guaranteed for $x \in \bar{L}$, while
smoothness is guaranteed for $x \in X \setminus \bar{L}$. Here, we require only approximate
correctness, and moreover only for elements in a subset $L \subseteq \bar{L}$. Details follow.

Fix a CCA-secure (labeled) public-key encryption scheme (Gen, Enc, Dec) and
an efficiently recognizable message space $\mathcal{D}$ (which will correspond to the
dictionary of passwords in our application to PAKE). We assume the encryption
scheme defines a notion of ciphertext validity such that (1) validity of a
ciphertext (with respect to $pk$) can be determined efficiently using $pk$ alone, and (2) all
honestly generated ciphertexts are valid. We also assume no decryption error.

For the rest of the discussion, fix a key pair $(pk, sk)$ as output by $\mathsf{Gen}(1^n)$
and let $\mathcal{C}$ denote the set of valid ciphertexts with respect to $pk$. Define sets
$X$, $\{L_m\}_{m \in \mathcal{D}}$, and $L$ as follows. First, set

$$X = \{(\mathsf{label}, C, m) \mid \mathsf{label} \in \{0,1\}^n;\ C \in \mathcal{C};\ m \in \mathcal{D}\}.$$

Next, for $m \in \mathcal{D}$ let
$L_m = \{(\mathsf{label}, \mathsf{Enc}_{pk}(\mathsf{label}, m), m) \mid \mathsf{label} \in \{0,1\}^n\} \subset X$;
i.e., $L_m$ is the set of honestly generated encryptions of $m$ (using any label). Let
$L = \bigcup_{m \in \mathcal{D}} L_m$. Finally, define

$$\bar{L}_m = \{(\mathsf{label}, C, m) \mid \mathsf{label} \in \{0,1\}^n;\ \mathsf{Dec}_{sk}(\mathsf{label}, C) = m\},$$

and set $\bar{L} = \bigcup_{m \in \mathcal{D}} \bar{L}_m$. (Recall we assume no decryption error, and so $\bar{L}_m$
depends only on $pk$.) Note that $L_m \subseteq \bar{L}_m$ for all $m$. Furthermore, for any ciphertext
$C$ and $\mathsf{label} \in \{0,1\}^n$ there is at most one $m \in \mathcal{D}$ for which $(\mathsf{label}, C, m) \in \bar{L}$.
Approximate smooth projective hash functions. An approximate smooth
projective hash function is a collection of keyed functions $\{H_k : X \to \{0,1\}^n\}_{k \in K}$,
along with a projection function $\alpha : K \times (\{0,1\}^* \times \mathcal{C}) \to S$, satisfying notions of
(approximate) correctness and smoothness:

Approximate correctness: If $x = (\mathsf{label}, C, m) \in L$ then the value of $H_k(x)$
is approximately determined by $\alpha(k, \mathsf{label}, C)$ and $x$ (in a sense we will make
precise below).

Smoothness: If $x \in X \setminus \bar{L}$ then the value of $H_k(x)$ is statistically close to
uniform given $\alpha(k, \mathsf{label}, C)$ and $x$ (assuming $k$ was chosen uniformly in $K$).

We stress that, in contrast to [9], we require nothing for $x \in \bar{L} \setminus L$; furthermore,
even for $x \in L$ we require only approximate correctness. We highlight also that,
as in [9], the projection function $\alpha$ should be a function of $\mathsf{label}, C$ only.

Formally, an $\varepsilon(n)$-approximate smooth projective hash function is defined
by a sampling algorithm that, given $pk$, outputs
$(K, G, H = \{H_k : X \to \{0,1\}^n\}_{k \in K}, S, \alpha : K \times (\{0,1\}^* \times \mathcal{C}) \to S)$ such that:

1. There are efficient algorithms for (1) sampling a uniform $k \in K$, (2)
computing $H_k(x)$ for all $k \in K$ and $x \in X$, and (3) computing $\alpha(k, \mathsf{label}, C)$ for
all $k \in K$ and $(\mathsf{label}, C) \in \{0,1\}^* \times \mathcal{C}$.

2. For $x = (\mathsf{label}, C, m) \in L$ the value of $H_k(x)$ is approximately determined
by $\alpha(k, \mathsf{label}, C)$, relative to the Hamming metric. Specifically, let $\mathsf{Ham}(a, b)$
denote the Hamming distance of two strings $a, b \in \{0,1\}^n$. Then there is
an efficient algorithm $H'$ that takes as input $s = \alpha(k, \mathsf{label}, C)$ and
$\bar{x} = (\mathsf{label}, C, m, r)$ for which $C = \mathsf{Enc}_{pk}(\mathsf{label}, m; r)$ and satisfies:

$$\Pr[\mathsf{Ham}(H_k(x), H'(s, \bar{x})) > \varepsilon \cdot n] = \mathsf{negl}(n),$$

where the probability is taken over choice of $k$.

3. For any $x = (\mathsf{label}, C, m) \in X \setminus \bar{L}$, the following two distributions have
statistical distance negligible in $n$:

$$\{k \leftarrow K;\ s = \alpha(k, \mathsf{label}, C) : (s, H_k(x))\}$$

and

$$\{k \leftarrow K;\ s = \alpha(k, \mathsf{label}, C);\ v \leftarrow \{0,1\}^n : (s, v)\}.$$

3 A PAKE Protocol from Approximate SPH 

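We use the standard definition of security for PAKE [3,17,9].

Here, we show that a modification of the Gennaro-Lindell framework [9] can
be used to construct a PAKE protocol from any CCA-secure encryption scheme
that has associated with it an approximate smooth projective hash function as
defined in Section 2. A high-level overview of the protocol is given in Figure 1;
a more detailed discussion follows.

Common reference string: $pk$

Client($w$):
    $(VK, SK) \leftarrow \mathcal{K}(1^k)$
    $r \leftarrow \{0,1\}^*$
    $\mathsf{label} := VK \,|\, \mathsf{Client} \,|\, \mathsf{Server}$
    $C := \mathsf{Enc}_{pk}(\mathsf{label}, w; r)$

Client to Server:  $\mathsf{Client} \,|\, VK \,|\, C$

Server($w$):
    $r' \leftarrow \{0,1\}^*$
    $\mathsf{label}' := \epsilon$
    $C' := \mathsf{Enc}_{pk}(\mathsf{label}', w; r')$
    $\mathsf{label} := VK \,|\, \mathsf{Client} \,|\, \mathsf{Server}$
    $k' \leftarrow K$;  $s' := \alpha(k', \mathsf{label}, C)$

Server to Client:  $\mathsf{Server} \,|\, C' \,|\, s'$

Client($w$):
    $\mathsf{label}' := \epsilon$
    $k \leftarrow K$;  $s := \alpha(k, \mathsf{label}', C')$
    $tk := H_k(\mathsf{label}', C', w) \oplus H_{k'}(\mathsf{label}, C, w)$
    $sk \leftarrow \{0,1\}^{\ell}$;  $c := \mathsf{ECC}(sk)$
    $\Delta := tk \oplus c$
    $\sigma \leftarrow \mathsf{Sign}_{SK}(C \,|\, C' \,|\, s' \,|\, s \,|\, \Delta)$

Client to Server:  $s \,|\, \Delta \,|\, \sigma$

Server($w$):
    if $\mathsf{Vrfy}_{VK}(C \,|\, C' \,|\, s' \,|\, s \,|\, \Delta, \sigma) = 1$:
        $tk' := H_k(\mathsf{label}', C', w) \oplus H_{k'}(\mathsf{label}, C, w)$
        $sk := \mathsf{ECC}^{-1}(tk' \oplus \Delta)$

Fig. 1. A 3-round PAKE protocol. The common session key is $sk$.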

Setup. We assume a common reference string is established before any
executions of the protocol take place. The common reference string consists of a
public key $pk$ for a CCA-secure encryption scheme (Gen, Enc, Dec) that has an
associated $\varepsilon$-approximate smooth projective hash system
$(K, G, H = \{H_k : X \to \{0,1\}^n\}_{k \in K}, S, \alpha : K \times (\{0,1\}^* \times \mathcal{C}) \to S)$.
We stress that no parties in the system need to hold the secret key corresponding to $pk$.

Protocol execution. We now describe an execution of the protocol between an
honest client Client and server Server, holding common password $w$. To begin,
the client runs a key-generation algorithm $\mathcal{K}$ for a one-time signature scheme
to generate verification key $VK$ and corresponding secret (signing) key $SK$. The
client sets $\mathsf{label} := VK|\mathsf{Client}|\mathsf{Server}$ and then encrypts the password $w$ using this
label to obtain ciphertext $C$. It then sends the message $\mathsf{Client}|VK|C$ to the server.

Upon receiving the initial message $\mathsf{Client}|VK|C$ from the client, the server
computes its own encryption of the password using label $\mathsf{label}' = \epsilon$, resulting in
a ciphertext $C'$. The server also chooses a random hash key $k' \leftarrow K$ and then
computes the projection $s' := \alpha(k', \mathsf{label}, C)$. It sends $C'$ and $s'$ to the client.

After receiving the second protocol message from the server, the client chooses
a random hash key $k \leftarrow K$ and computes the projection $s := \alpha(k, \mathsf{label}', C')$.
At this point it computes a temporary session key
$tk = H_k(\mathsf{label}', C', w) \oplus H_{k'}(\mathsf{label}, C, w)$, where $H_k(\mathsf{label}', C', w)$ is computed
using the known hash key $k$, and $H_{k'}(\mathsf{label}, C, w)$ is computed (from the projection $s'$)
using the randomness $r$ that was used to generate $C$. (Recall that $C$ is an honestly
generated encryption of $w$.) Up to this point, the protocol follows the
Gennaro-Lindell framework exactly. As will become clear, however, the server will
not be able to recover $tk$ but will instead only recover some value $tk'$ that is
close to $tk$; the rest of the client’s computation is aimed at repairing this defect.

The client chooses a random session key $sk \in \{0,1\}^{\ell}$ for some $\ell$ to be specified.
Let $\mathsf{ECC} : \{0,1\}^{\ell} \to \{0,1\}^n$ be an error-correcting code correcting a $2\varepsilon$-fraction
of errors. The client computes $c := \mathsf{ECC}(sk)$ and sets $\Delta := tk \oplus c$. Finally, it signs
$C|C'|s'|s|\Delta$ and sends $s$, $\Delta$, and the resulting signature $\sigma$ to the server.

The server verifies $\sigma$ in the obvious way and rejects if the signature is invalid.
Otherwise, the server computes a temporary session key $tk'$ analogously to the
way the client did: that is, the server sets
$tk' = H_k(\mathsf{label}', C', w) \oplus H_{k'}(\mathsf{label}, C, w)$,
where $H_{k'}(\mathsf{label}, C, w)$ is computed using the hash key $k'$ known to the server,
and $H_k(\mathsf{label}', C', w)$ is computed (from the projection $s$) using the randomness $r'$
that was used to generate $C'$. (Recall that $C'$ is an honestly generated encryption
of $w$.) Finally, the server computes $sk := \mathsf{ECC}^{-1}(tk' \oplus \Delta)$.

Correctness. We now argue that, in an honest execution of the protocol, the
client and server compute matching session keys with all but negligible
probability. Approximate correctness of the smooth projective hash function implies that
$H_{k'}(\mathsf{label}, C, w)$ as computed by the client is within Hamming distance $\varepsilon n$ from
$H_{k'}(\mathsf{label}, C, w)$ as computed by the server, except with negligible probability.
The same holds for $H_k(\mathsf{label}', C', w)$. Thus, with all but negligible probability
we have $\mathsf{Ham}(tk, tk') \le 2\varepsilon \cdot n$. Assuming this is the case we have

$$\mathsf{Ham}(tk' \oplus \Delta, c) = \mathsf{Ham}(tk' \oplus \Delta, tk \oplus \Delta) \le 2\varepsilon \cdot n,$$

and so $\mathsf{ECC}^{-1}(tk' \oplus \Delta) = \mathsf{ECC}^{-1}(c) = sk$.
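
The repair step can be illustrated with a small sketch. The paper does not fix a particular code, so the repetition code below, the block length `REPS`, and the concrete bit strings are all assumptions chosen only to make the example self-contained; a repetition code tolerates a bounded number of flips per block rather than an arbitrary $2\varepsilon$-fraction of errors.

```python
def ecc_encode(bits, reps):
    """Toy repetition code (an assumption, not the paper's ECC): repeat each bit."""
    return [b for b in bits for _ in range(reps)]

def ecc_decode(word, reps):
    """Decode by majority vote inside each block of `reps` positions."""
    blocks = [word[i:i + reps] for i in range(0, len(word), reps)]
    return [1 if 2 * sum(block) > reps else 0 for block in blocks]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

REPS = 9                        # each block tolerates up to 4 flipped bits
sk = [1, 0, 1, 1]               # session key chosen by the client
c = ecc_encode(sk, REPS)        # codeword c = ECC(sk), length n = 36

tk = [(7 * i + 3) % 2 for i in range(len(c))]   # client's temporary key (arbitrary bits)
delta = xor(tk, c)              # Delta = tk XOR c, sent to the server

# The server only obtains tk', which differs from tk in a few positions.
tk_prime = list(tk)
for pos in (0, 10, 20, 30):
    tk_prime[pos] ^= 1

# tk' XOR Delta equals c with the same positions flipped; decoding removes them.
recovered = ecc_decode(xor(tk_prime, delta), REPS)
```

Here `recovered` equals `sk` because each block of the noisy codeword contains a single flipped bit, well within what majority decoding can absorb.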

Security. The proof of security of the protocol follows [17,9] closely; we sketch
the main ideas. First, as in [17,9], we note that for a passive adversary (i.e.,
one that simply observes interactions between the server and the client), the
shared session-key is pseudorandom. This is simply because the transcript of
each interaction consists of semantically-secure encryptions of the password $w$
and the projected keys of the approximate SPH system.

It remains to deal with active (man-in-the-middle) adversaries that modify
the messages sent from the client to the server and back. The crux of our proof,
as in [17,9], is a combination of the following two observations (for concreteness,
consider an adversary that interacts with a client instance holding password $w$).

- By the CCA-security of the encryption scheme, the probability that the
adversary can construct a new ciphertext that decrypts to the client’s password
$w$ is at most $q/|\mathcal{D}| + \mathsf{negl}(n)$, where $q$ is the number of on-line attacks and
$\mathcal{D}$ is the password dictionary.

- If the adversary sends the client a ciphertext that does not decrypt to the
client’s password, then the session-key computed by the client is statistically
close to uniform conditioned on the adversary’s view.

We defer a complete proof to the full version.

Recalling the definitions from Section 2, note that correctness of the protocol
relies on (approximate) correctness for honestly generated encryptions of the
correct password (i.e., for $x \in L$), whereas security requires smoothness for
ciphertexts that do not decrypt to the correct password (i.e., for $x \notin \bar{L}$).

4 The Learning with Errors Problem 

The “learning with errors” (LWE) problem was introduced by Regev m as a
generalization of the “learning parity with noise” problem. For positive integers
$n$ and $q \ge 2$, a vector $s \in \mathbb{Z}_q^n$, and a probability distribution $\chi$ on $\mathbb{Z}_q$, let $A_{s,\chi}$
be the distribution obtained by choosing a vector $a \in \mathbb{Z}_q^n$ uniformly at random
and a noise term $x \leftarrow \chi$, and outputting $(a, \langle a, s \rangle + x) \in \mathbb{Z}_q^n \times \mathbb{Z}_q$.

For an integer $q = q(n)$ and an error distribution $\chi = \chi(n)$ over $\mathbb{Z}_q$, the
learning with errors problem $\mathsf{LWE}_{q,\chi}$ is defined as follows: Given access to an oracle
that outputs (polynomially many) samples from $A_{s,\chi}$ for a uniformly random
$s \in \mathbb{Z}_q^n$, output $s$ with noticeable probability. The decisional variant of the LWE
problem, denoted $\mathsf{distLWE}_{q,\chi}$, is to distinguish samples chosen according to $A_{s,\chi}$
for a uniformly random $s \in \mathbb{Z}_q^n$ from samples chosen according to the uniform
distribution over $\mathbb{Z}_q^n \times \mathbb{Z}_q$. Regev m shows that for $q = \mathsf{poly}(n)$ prime, the LWE
and distLWE problems are polynomially equivalent.
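
As a minimal sketch of the distribution $A_{s,\chi}$, the following generates samples for toy parameters. The values of `n`, `q`, and `bound`, and the bounded uniform noise used in place of $\chi$, are assumptions made for illustration only and are far too small to be secure.

```python
import random

def lwe_sample(s, q, noise, rng):
    """Return one sample (a, <a, s> + x mod q) with a uniform and x drawn by `noise`."""
    a = [rng.randrange(q) for _ in s]
    x = noise(rng)
    b = (sum(ai * si for ai, si in zip(a, s)) + x) % q
    return a, b

def centered(v, q):
    """Representative of v mod q in the interval (-q/2, q/2]."""
    v %= q
    return v - q if v > q // 2 else v

rng = random.Random(0)
n, q, bound = 8, 257, 3
noise = lambda r: r.randint(-bound, bound)   # stand-in for chi (assumption)
s = [rng.randrange(q) for _ in range(n)]
samples = [lwe_sample(s, q, noise, rng) for _ in range(20)]

# With knowledge of s, the noise term of every sample is recovered exactly;
# without s, the second component is meant to look uniform in Z_q.
residuals = [centered(b - sum(ai * si for ai, si in zip(a, s)), q) for a, b in samples]
```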

Gaussian error distributions. For any $r > 0$, the density function of a
one-dimensional Gaussian distribution over $\mathbb{R}$ is given by $D_r(x) = 1/r \cdot \exp(-\pi (x/r)^2)$.
In this work we always use a truncated Gaussian distribution, i.e., the Gaussian
distribution $D_r$ whose support is restricted to $x$ such that $|x| < r\sqrt{n}$. The
truncated and non-truncated distributions are statistically close, and we drop the word
“truncated” from now on. For $\beta > 0$, define $\bar{\Psi}_\beta$ to be the distribution on $\mathbb{Z}_q$
obtained by drawing $y \leftarrow D_\beta$ and outputting $\lfloor q \cdot y \rceil \pmod q$. We write $\mathsf{LWE}_{q,\beta}$ as
an abbreviation for $\mathsf{LWE}_{q,\bar{\Psi}_\beta}$.
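
A rough sketch of sampling from the rounded, truncated Gaussian just described follows. The function name `sample_psi_bar` and the parameter values are assumptions; the only fact used beyond the text is that the density $D_r$ above is a normal density with standard deviation $r/\sqrt{2\pi}$.

```python
import math
import random

def sample_psi_bar(q, beta, n, rng):
    """Draw y from D_beta truncated to |y| < beta*sqrt(n); return round(q*y) mod q."""
    sigma = beta / math.sqrt(2 * math.pi)   # D_r has standard deviation r / sqrt(2*pi)
    limit = beta * math.sqrt(n)
    while True:                              # rejection step implements the truncation
        y = rng.gauss(0.0, sigma)
        if abs(y) < limit:
            return round(q * y) % q

def centered(v, q):
    """Representative of v mod q in the interval (-q/2, q/2]."""
    v %= q
    return v - q if v > q // 2 else v

rng = random.Random(42)
q, beta, n = 1009, 0.01, 16                  # toy values (assumptions)
draws = [sample_psi_bar(q, beta, n, rng) for _ in range(1000)]
largest = max(abs(centered(v, q)) for v in draws)
```

Because of the truncation, every draw has centered magnitude at most $q \beta \sqrt{n}$ plus a rounding error of at most one half.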

We also define the discrete Gaussian distribution $D_{\mathbb{Z}^m, r}$ over the integer lattice
$\mathbb{Z}^m$, which assigns probability proportional to $\prod_{i \in [m]} D_r(e_i)$ to each $e \in \mathbb{Z}^m$. It
is possible to efficiently sample from $D_{\mathbb{Z}^m, r}$ for any $r > 0$ HDj.

Evidence for the hardness of $\mathsf{LWE}_{q,\beta}$ follows from results of Regev m, who
gave a quantum reduction from approximating certain problems on n-dimensional
lattices in the worst case to within $\tilde{O}(n/\beta)$ factors to solving $\mathsf{LWE}_{q,\beta}$ for dimension
$n$, subject to the condition that $\beta \cdot q > 2\sqrt{n}$. Recently, Peikert [20] also gave a
related classical reduction for similar parameters. For our purposes, we note that
the $\mathsf{LWE}_{q,\beta}$ problem is believed to be hard (given the state-of-the-art in lattice
algorithms) for any polynomial $q$ and inverse-polynomial $\beta$ (subject to the above
condition).

Matrix notation for LWE. In this paper, we view all our vectors as column
vectors. At times, we find it convenient to describe the LWE problem $\mathsf{LWE}_{m,q,\beta}$
using a compact matrix notation: find $s$ given $(A, As + x)$, where $A \leftarrow \mathbb{Z}_q^{m \times n}$
is chosen uniformly and $x \leftarrow \bar{\Psi}_\beta^m$. We also use similar notation for the decision
version distLWE.

Connection to lattices. The LWE problem can be thought of as a
“bounded-distance decoding problem” on a particular kind of m-dimensional lattice defined
by the matrix $A$. Specifically, define the lattice

$$\Lambda(A) = \{y \in \mathbb{Z}^m : \exists s \in \mathbb{Z}^n \text{ s.t. } y = As \pmod q\}.$$

The LWE problem can then be restated as: given $y$ which is the sum of a lattice
point $As$ and a short “noise vector” $x$, find the “closest” lattice vector $As$
(equivalently, $s$). One can show that as long as $x$ is short (say, $\|x\| < q/16$),
there is a unique closest vector to $y$ (see, e.g., P|).
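
The decoding view can be made concrete on a toy lattice by exhaustive search. Everything below is an illustrative assumption: $q = 101$, $m = 2$, $n = 1$, and the single-column matrix $A = (1, 10)^T$. For this choice $\Lambda(A)$ has the orthogonal basis $(1, 10), (10, -1)$, so its shortest non-zero vector has length $\sqrt{101} \approx 10.05$, and any noise shorter than half of that decodes uniquely.

```python
def centered(v, q):
    """Representative of v mod q in the interval (-q/2, q/2]."""
    v %= q
    return v - q if v > q // 2 else v

def closest_lattice_point(y, A, q):
    """Brute force over s in Z_q (n = 1): return (s, squared distance of y to A*s mod q)."""
    best = None
    for s in range(q):
        d2 = sum(centered(yi - ai * s, q) ** 2 for yi, ai in zip(y, A))
        if best is None or d2 < best[1]:
            best = (s, d2)
    return best

Q = 101
A = [1, 10]                       # toy m x n matrix with m = 2, n = 1
s_true, x = 37, [2, -3]           # secret and short noise vector
y = [(a * s_true + xi) % Q for a, xi in zip(A, x)]

decoded = closest_lattice_point(y, A, Q)
```

The search returns `s_true` together with the squared norm of the noise, since every other lattice point is at distance at least $\sqrt{101} - \sqrt{13}$ from `y`.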

4.1 Some Supporting Lemmas 

We present two technical lemmas regarding the LWE problem that will be used
to prove smoothness of our (approximate) SPH systems in Sections 5.2 and 6.2.
If $m \ge n \log q$, the lattice $\Lambda(A)$ is quite sparse. In fact, we expect most vectors
$z \in \mathbb{Z}_q^m$ to be far from $\Lambda(A)$. The first lemma (originally shown in [23]) formalizes
this intuition.

Let $\mathsf{dist}(z, \Lambda(A))$ denote the distance of the vector $z$ from the lattice $\Lambda(A)$.
The lemma shows that for most matrices $A \in \mathbb{Z}_q^{m \times n}$, the fraction of vectors
$z \in \mathbb{Z}_q^m$ that are “very close” to $\Lambda(A)$ is “very small”. The proof is by the
probabilistic method, and appears in the full version.

Lemma 1. Let $n, q, m$ be integers such that $m \ge n \log q$. For all but a negligible
fraction of matrices $A$,

$$\Pr_z[\mathsf{dist}(z, \Lambda(A)) \le \sqrt{q}/4] \le q^{-(m+n)/2}.$$

Fix a number $r > 0$, and let $e \leftarrow D_{\mathbb{Z}^m, r}$ be drawn from the discrete Gaussian
distribution over the integer lattice $\mathbb{Z}^m$. If the vector $z$ is (close to) a linear
combination of the columns of $A$, then given $e^T A$ one can (approximately)
predict $e^T z$. The second lemma shows a converse of this statement when $r$ is
large enough. Namely, it says that if $z$ and all its non-zero multiples are far
from the lattice $\Lambda(A)$, then $e^T A$ does not give any information about $e^T z$.
In other words, given $e^T A$ (where $e \leftarrow D_{\mathbb{Z}^m, r}$ for a large enough $r$), $e^T z$ is
statistically close to random. This lemma was first shown in m3, and was used
in the construction of an oblivious transfer protocol in El.

More formally, for a matrix $A \in \mathbb{Z}_q^{m \times n}$ and a vector $z \in \mathbb{Z}_q^m$, let $\Delta_r(A, z)$
denote the statistical distance between the uniform distribution on $\mathbb{Z}_q^{n+1}$ and
the distribution of $(e^T A, e^T z)$, where $e \leftarrow D_{\mathbb{Z}^m, r}$. Then,

Lemma 2. [H3 Lemma 6.3] Let $r \ge \sqrt{q} \cdot \omega(\sqrt{\log n})$. Then for most matrices
$A \in \mathbb{Z}_q^{m \times n}$, the following is true: if $z \in \mathbb{Z}_q^m$ is such that for all non-zero $a \in \mathbb{Z}_q$,
$\mathsf{dist}(az, \Lambda(A)) > \sqrt{q}/4$, then $\Delta_r(A, z) \le \mathsf{negl}(n)$.

5 Approximate Smooth Projective Hashing from Lattices 

As a warmup to our main result we first construct a CPA-secure encryption 
scheme with an approximate SPH system. The main ideas in our final construc- 
tion are already present here. 


5.1 A CPA-Secure Encryption Scheme 

The encryption scheme we use is a variant of the scheme presented in jl()B2t)j,
and is based on the hardness of the LWE problem. We stress that the novelty of
this work is in constructing an approximate SPH system for this scheme.

We begin by describing a basic encryption scheme having decryption time
exponential in the message length.² We then modify the scheme so that decryption
can be done in polynomial time.

The message space is $\mathbb{Z}_q^{\ell}$ for some integers $q, \ell$. In the basic encryption scheme,
the public key consists of a matrix $B \in \mathbb{Z}_q^{m \times n}$, along with $\ell + 1$ vectors
$u_0, \ldots, u_\ell \in \mathbb{Z}_q^m$. To encrypt a message $w = (w_1, \ldots, w_\ell) \in \mathbb{Z}_q^{\ell}$ the sender chooses a
uniformly random vector $s \leftarrow \mathbb{Z}_q^n$ and an error vector $x \leftarrow \bar{\Psi}_\beta^m$. The ciphertext is

$$y = Bs + \Big(u_0 + \sum_{i=1}^{\ell} w_i \cdot u_i\Big) + x \in \mathbb{Z}_q^m.$$

The scheme is CPA-secure, since the $\mathsf{distLWE}_{q,\beta}$ assumption implies that the
ciphertext is pseudorandom.

The ciphertext produced by the encryption algorithm is a vector $y$ such that
$y - (u_0 + \sum_{i=1}^{\ell} w_i \cdot u_i)$ is “close” to the lattice $\Lambda(B)$ (the exact definition of
“close” depends on the error parameter $\beta$). Decrypting a ciphertext is done by
finding (via exhaustive search over the message space) a message $w$ for which
$y - (u_0 + \sum_{i=1}^{\ell} w_i \cdot u_i)$ is “close” to $\Lambda(B)$, using the following trapdoor structure
first discovered by Ajtai [1], and later improved by Alwen and Peikert [2].

2 Interestingly, for our eventual application to PAKE a CCA-secure version of this
scheme would suffice since the scheme has the property that it is possible to efficiently
tell whether a given ciphertext is an encryption of a given message (and this is all
that is needed to prove security for the protocol in Section 3).

Lemma 3 ([1,2]). Fix integers $q \ge 2$ and $m \ge 4n \log_2 q$. There is a ppt
algorithm $\mathsf{TrapSamp}(1^n, q, m)$ that outputs matrices $B \in \mathbb{Z}_q^{m \times n}$ and $T \in \mathbb{Z}^{m \times m}$ such
that the distribution of $B$ is statistically close to the uniform distribution over
$\mathbb{Z}_q^{m \times n}$, and there is an algorithm $\mathsf{BDDSolve}(T, \cdot)$ that takes as input a vector
$z \in \mathbb{Z}_q^m$ and does the following:

- if there is a vector $s \in \mathbb{Z}_q^n$ such that $\mathsf{dist}(z, Bs \pmod q) \le \sqrt{q}/4$, then the
output is $s$.

- if for every vector $s \in \mathbb{Z}_q^n$, $\mathsf{dist}(z, Bs) > \sqrt{q}/4$, then the output is $\perp$.

Proof. $T$ is a full-rank matrix such that (a) each row $t_i$ of $T$ has bounded $\ell_2$ norm,
i.e., $\|t_i\| \le 4\sqrt{m}$, and (b) $TB = 0 \pmod q$. [1,2] showed how to sample a
pair $(B, T)$ such that $B$ is statistically close to uniform and $T$ has the above
properties.

Given such a matrix $T$ and a vector $z \in \mathbb{Z}_q^m$, $\mathsf{BDDSolve}(T, z)$ works as follows:

- First, compute $z' = q \cdot T^{-1} \cdot \lfloor (T \cdot z)/q \rceil \pmod q$.

- Compute (using Gaussian elimination) a vector $s \in \mathbb{Z}_q^n$ such that $z' = Bs$
(if such exists; else, output $\perp$).

- If $\mathsf{dist}(z, Bs) \le \sqrt{q}/4$, then output $s$, else output $\perp$.

First, if $z = Bs + x$ for some $s \in \mathbb{Z}_q^n$ and $x \in \mathbb{Z}^m$ such that $\|x\| \le \sqrt{q}/4$, then
the procedure above computes

$$z' = q \cdot T^{-1} \cdot \lfloor (T \cdot (Bs + x))/q \rceil \pmod q = Bs \pmod q.$$

This is because each co-ordinate of $Tx$ has magnitude at most
$\|T\| \cdot \|x\| \le 4\sqrt{m} \cdot \sqrt{q}/4 \ll q$, and consequently,

$$\lfloor (T \cdot (Bs + x))/q \rceil = \lfloor (T \cdot Bs)/q \rceil = T \cdot (Bs)/q,$$

where the final equality is because $TB = 0 \pmod q$.

Finally, if $\mathsf{dist}(z, \Lambda(B)) > \sqrt{q}/4$, then the last line of the procedure above
causes the output to be $\perp$ always. □
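
The three steps of BDDSolve can be traced on a hand-made toy instance. This is only a sketch under stated assumptions: the matrices `B` and `T` below are chosen by hand (not produced by TrapSamp), the dimensions are $m = 2$, $n = 1$, $q = 101$, and the distance threshold `max_dist = 4` replaces the bound $\sqrt{q}/4$ of the lemma so that the rounding step is guaranteed to succeed for this particular `T`.

```python
Q = 101
B = [1, 10]                    # toy m x n matrix (m = 2, n = 1), a single column
T = [[10, -1], [1, 10]]        # short rows with T*B = 0 (mod Q); det(T) = Q
T_ADJ = [[10, 1], [-1, 10]]    # adjugate of T, which equals Q * T^{-1}

def round_div(a, q):
    """Nearest integer to a/q for q > 0."""
    return (2 * a + q) // (2 * q)

def mat_vec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def centered(v, q):
    v %= q
    return v - q if v > q // 2 else v

def bdd_solve(z, max_dist):
    """Toy BDDSolve: return s if z is within max_dist of B*s (mod Q), else None."""
    # Step 1: z' = Q * T^{-1} * round(T*z / Q) mod Q removes the noise.
    rounded = [round_div(t, Q) for t in mat_vec(T, z)]
    z_prime = [v % Q for v in mat_vec(T_ADJ, rounded)]
    # Step 2: solve z' = B*s; here B[0] = 1, so s is the first coordinate.
    s = z_prime[0]
    if [(b * s) % Q for b in B] != z_prime:
        return None
    # Step 3: accept only if z really is close to B*s.
    err = [centered(zi - zpi, Q) for zi, zpi in zip(z, z_prime)]
    if sum(e * e for e in err) > max_dist ** 2:
        return None
    return s

near = bdd_solve([39, 64], 4)   # z = B*37 + (2, -3) mod Q
far = bdd_solve([50, 0], 4)     # at distance 5 from the nearest lattice point
```

In this instance the first call recovers the secret 37, while the second input is rejected in the final distance check.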


We now modify the decryption algorithm in two ways. The first of these
modifications ensures that the decryption algorithm runs in polynomial time, and the
second is needed for our approximate SPH system.

First, to avoid the exponential dependence of the decryption time on the
message length, we modify the encryption scheme by letting the public key contain
the matrix $A = [B|U]$, where the columns of $U \in \mathbb{Z}_q^{m \times (\ell+1)}$ are the vectors
$u_0, \ldots, u_\ell$. The secret key is a trapdoor for the entire matrix $A$ (as opposed to
just $B$ as in the previous description). The ciphertext from the previous
description can then be written as

$$y = A \cdot \begin{pmatrix} s \\ 1 \\ w \end{pmatrix} + x \pmod q,$$

and decryption uses the BDDSolve procedure from Lemma 3 to recover the
vector $(s, 1, w)$. The crucial point is that, during key generation, the receiver can
generate the matrix $A$ along with an appropriate trapdoor for decryption.

Secondly, we relax the decryption algorithm so that it finds an $a \in \mathbb{Z}_q$ and a
message $w$ for which $a(y - (u_0 + \sum_{i=1}^{\ell} w_i \cdot u_i))$ is “close” to $\Lambda(B)$. This modified
decryption algorithm correctly decrypts the ciphertexts generated by Enc (which
corresponds to the case $a = 1$), but it also decrypts ciphertexts that would never
be output by Enc. This modification to the decryption algorithm enables us to
prove smoothness for the approximate SPH system.

Parameters. Let $n$ be the security parameter, and $\ell = n$ be the message length.
The parameters of the system are a prime $q = q(n, \ell)$, a positive integer
$m = m(n, \ell)$, and a Gaussian error parameter $\beta = \beta(n, \ell) \in (0, 1]$ that defines a
distribution $\bar{\Psi}_\beta$. For concrete instantiations of these parameters, see Theorem 1.

We now describe the scheme:

Key generation. Choose a matrix $A \in \mathbb{Z}_q^{m \times (n+\ell+1)}$ together with the
trapdoor $T$ by running $(A, T) \leftarrow \mathsf{TrapSamp}(1^m, 1^{n+\ell+1}, q)$, where TrapSamp is as
described in Lemma 3. Let the public key be $A$ and the secret key be $T$.

Encryption. To encrypt the message $w \in \mathbb{Z}_q^{\ell}$ with respect to a public key as
above, the sender chooses $s \leftarrow \mathbb{Z}_q^n$ uniformly at random, and an error vector
$x \leftarrow \bar{\Psi}_\beta^m$. The ciphertext is

$$y = A \cdot \begin{pmatrix} s \\ 1 \\ w \end{pmatrix} + x \pmod q.$$

Decryption. The decryption algorithm works as below.

for $a = 1$ to $q - 1$ do
    Compute $(s, a', w) \leftarrow \mathsf{BDDSolve}(T, a y)$
    if $a' = a$ then
        output $w/a$ and stop
    else try the next value of $a$
end
If the above fails for all $a$, output $\perp$.

Theorem 1. Let $n, \ell, m, q, \beta$ be chosen such that $m \ge 4(n + \ell) \log q$ and
$\beta < 1/(2 \cdot m^2 n \cdot \omega(\sqrt{\log n}))$. Then the scheme above is a CPA-secure encryption
scheme assuming the hardness of $\mathsf{distLWE}_{n,m,q,\beta}$.

5.2 An Approximate SPH System 

Fix a public key $A \in \mathbb{Z}_q^{m \times (n+\ell+1)}$ for the system (where we write $A = [B|U]$, as
usual), and a dictionary $\mathcal{D} = \mathbb{Z}_q^{\ell}$. Sets $X$, $L_m$ and $\bar{L}_m$ are defined in Section 2.
(For our purposes, all vectors $y \in \mathbb{Z}_q^m$ are valid ciphertexts.) Let $r$ be such that

$$\sqrt{q} \cdot \omega(\sqrt{\log n}) \le r \le \varepsilon/(8 \cdot m n^2 \cdot \beta).$$

(Looking ahead, we remark that the upper bound on $r$ will be used for
correctness, and the lower bound will be used for smoothness.)

A key for the SPH system is a $k$-tuple of vectors $(e_1, \ldots, e_k)$ where each
$e_i \leftarrow D_{\mathbb{Z}^m, r}$ is drawn independently from the discrete Gaussian distribution. The
reader may want to keep in mind the inverse relationship between the parameters
$r$ and $\beta$: the larger the error parameter $\beta$ in the encryption scheme, the smaller
the discrete-Gaussian radius $r$ (and vice versa).

1. The projection set is $S = (\mathbb{Z}_q^n)^k$. For a key $(e_1, \ldots, e_k) \in (\mathbb{Z}^m)^k$, the projection
is $\alpha(e_1, \ldots, e_k) = (u_1, \ldots, u_k)$, where $u_i = B^T e_i$.

2. We now define the smooth projective hash function $H = \{H_k\}_{k \in K}$. On input
a key $(e_1, \ldots, e_k) \in K$ and a ciphertext $c = (\mathsf{label}, y, m)$, the hash function
is computed as follows. First compute

$$z_i = e_i^T \Big(y - U \cdot \begin{pmatrix} 1 \\ m \end{pmatrix}\Big) \pmod q.$$

Treat $z_i$ as a number in $[-(q-1)/2 \ldots (q-1)/2]$ and output $b_1 \ldots b_k \in \{0,1\}^k$
where

$$b_i = \begin{cases} 0 & \text{if } z_i < 0 \\ 1 & \text{if } z_i > 0 \end{cases}$$

3. On input a projected key $(u_1, \ldots, u_k) \in S$, a ciphertext $c = (\mathsf{label}, y, m)$
and a witness $s \in \mathbb{Z}_q^n$ for the ciphertext, the hash function is computed as
$H'_u(c, s) = b_1 \ldots b_k$ where

$$b_i = \begin{cases} 0 & \text{if } u_i^T s < 0 \\ 1 & \text{if } u_i^T s > 0 \end{cases}$$

Theorem 2. Let the parameters n, t, m, q. (3 be as in Theorem 0 and r be as 
above. Then, Tt = {Hk}keK is an c- approximate smooth projective hash system. 

Proof. Clearly, the following procedures can all be done in polynomial time: 

(1) sampling a uniform key for the hash function (ei,...,efc) (D'/m r ) k , 

(2) computing the hash function H on input the key (ei, . . . ,efc) and a cipher- 
text c, (3) computing the projection-key a(ei, . . . , e*,), and (4) computing the 
hash function given the projected key (ui, . . . , Ufc), a ciphertext c, and a witness 
s for the ciphertext c. 

Approximate correctness. We now show $\varepsilon$-approximate correctness. Consider
any $(\mathsf{label}, \mathbf{y}, \mathbf{m}) \in L$, i.e., where $\mathbf{y}$ is a ciphertext produced by the encryption
algorithm on input the message $\mathbf{m}$. This means that $\mathbf{y}$ can be written as



(mod q ) 


(1) 


where $\|\mathbf{x}\| \le \beta q\cdot\sqrt{mn}$ (recall we work with truncated Gaussians).
We first show that for each $i \in [k]$, the values $z_i$ (computed using the key) and
$\mathbf{s}^T\mathbf{u}_i$ (computed using the projected key) are “close”. More precisely, we show
that $|z_i - \mathbf{u}_i^T\mathbf{s}| \le \varepsilon/2\cdot(q/4)$. This follows because


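$$|z_i - \mathbf{u}_i^T\mathbf{s}| = |\mathbf{e}_i^T(\mathbf{B}\mathbf{s} + \mathbf{x}) - \mathbf{u}_i^T\mathbf{s}| = |\mathbf{e}_i^T\mathbf{x}|, \qquad (2)$$

where the first equality uses the fact that $\mathbf{y}$ can be written as in Equation (1),
and the second uses the fact that $\mathbf{u}_i^T = \mathbf{e}_i^T\mathbf{B}$. Now, $|\mathbf{e}_i^T\mathbf{x}| \le \|\mathbf{e}_i\|\cdot\|\mathbf{x}\| \le (r\sqrt{mn})\cdot(\beta q\sqrt{nm}) \le \varepsilon/2\cdot q/4$.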

Each $\mathbf{u}_i$ is statistically close to uniform, by an application of the leftover hash
lemma; in particular, this means that $\mathbf{s}^T\mathbf{u}_i \in \mathbb{Z}_q$ is uniformly random. Let $b_i$
be the $i$-th bit of $H_{(\mathbf{e}_1,\ldots,\mathbf{e}_k)}(c)$ and $b'_i$ be the $i$-th bit of $H'_{(\mathbf{u}_1,\ldots,\mathbf{u}_k)}(c, \mathbf{s})$. Using
Equation (2), we see that the probability that $b_i \ne b'_i$ (over the randomness
of $\mathbf{e}_i$) is at most $\varepsilon/2$. Thus, by a Chernoff bound, the Hamming distance between
$H_{(\mathbf{e}_1,\ldots,\mathbf{e}_k)}(c)$ and $H'_{(\mathbf{u}_1,\ldots,\mathbf{u}_k)}(c, \mathbf{s})$ is at most $\varepsilon k$ with overwhelming probability.

This shows approximate correctness. 

Smoothness. Consider any $(\mathsf{label}, \mathbf{y}, \mathbf{m}) \in X \setminus L$. By definition of $L$, this means
that the decryption algorithm, on input $(\mathsf{label}, \mathbf{y}, \mathbf{m})$ and any possible secret key
$sk$, does not output $\mathbf{m}$. In other words, the decryption algorithm outputs either
$\bot$ or a message $\mathbf{m}' \ne \mathbf{m}$. Define



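We will show that for every non-zero $a \in \mathbb{Z}_q$, $a\mathbf{z}$ is far from the lattice $\Lambda(\mathbf{B})$. More
precisely, we will show that for every non-zero $a \in \mathbb{Z}_q$,


$$\mathrm{dist}(a\mathbf{z}, \Lambda(\mathbf{B})) \ge \sqrt{q}/4.$$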


An application of LemmaElthen shows that for every i £ [k], the pair (ef B , ej z) 
is statistically close to the uniform distribution over Z” +1 . 

Let us analyze the two cases: 

- The output of the decryption algorithm is $\bot$. In particular, this means that
for every $a \in [1 \ldots q-1]$, the vector $a\mathbf{z}$ is far from $\Lambda(\mathbf{B})$.

- The output of the decryption algorithm is a message $\mathbf{m}' \ne \mathbf{m}$. This could
happen only if there is an $a' \in \mathbb{Z}_q$ such that $a'\mathbf{z}'$ is close to the lattice $\Lambda(\mathbf{B})$.
Suppose, for contradiction, that $a\mathbf{z}$ is close to $\Lambda(\mathbf{B})$ as well. The claim below
shows that this cannot happen with high probability over the random choice
of $\mathbf{U}$. Thus, with high probability, $a\mathbf{z}$ is far from $\Lambda(\mathbf{B})$.

Claim. The following event happens with negligible probability over the uni- 
formly random choice of U G there exist numbers a, a' £ Z q , vectors 

m / m' € Z* and a vector y £ Z™ s.t. 


dist(az, A(B)) < y/q/A and dist(a , z / , A(B)) < y/q/A. 


This holds only for $\mathbf{s} \ne 0$. We omit consideration of this technical issue for the
purposes of this paper.
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Proof. Fix some a, a! G Z 9 , m 7 ^ m' G and y G Z™. We first observe 
that since the vectors and are li neal "ly independent and U 

is uniformly random, the vectors az and a'z! are uniformly random and 
(statistically) independent. Applying Lemma [fl we get that 


Pr UeZ m x «[dist(az, A(B))v^/4 and d ist(a'z', A(B)) < ^/q/ 4] 
< (q-™/ 2 • negl(n )) 2 = q~ m ■ negl(n). 


Now, an application of union bound shows that the required probability is 
at most q 2 ■ q' M ■ q m ■ (q~ m ■ negl(n)), which is negligible in n. □ 

This completes the proof of Theorem 2. □


6 A CCA-Secure Encryption Scheme Based on Lattices 

In this section we describe a CCA-secure encryption scheme, along with an
approximate SPH system, based on the hardness of the LWE problem. The CCA-secure
encryption scheme builds on the CPA-secure encryption scheme described
in Section 5.1 and the SPH system is the same as the one from Section 5.2 with
a few modifications.


6.1 A CCA-Secure Encryption Scheme 

The encryption scheme is similar to the schemes in j 21 )H 2 j (which, themselves, 
are instantiations of the general construction of Rosen and Segev |23). The main 
difference between j'iOH'ij and our scheme is the relaxed notion of decryption, 
which we already use in the CPA-secure construction in Section 15. II A formal 
description of the scheme follows. 

Parameters. Let n be the security parameter, and £ = poly(n) be the message 
length. The parameters of the system are a prime q = q(n,£ ), an integer m = 
m(n,£) G Z+, and a Gaussian error parameter 8 = /3(n,£) G (0, 1] that defines a 
distribution Wp. For concrete instantiations of these parameters, see Theorem 01 

Key generation. For i G [n] and b G {0,1}, choose 2 n matrices A *— 
Z“(" +<+ i) together with short bases Sy, G Z mxm for A J -(A ii b). More pre- 
cisely, let 

(A ijh , Si, fe ) <- TrapSamp(l m , l n+tw , q), 

where TrapSamp is as described in Lemma 01 Output the public and secret keys 

pk = {Aj )0 , and sk = {Sqoj Si,i}. 

(Note that the receiver does not use the trapdoors for i > 1 and so the {A, ; 
could, in fact, simply be chosen at random.) 
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Encryption. To encrypt the message w £ T} q with respect to a public key as 
above, the sender first generates a key pair (VK, SK) <— SigKeyGenfl") for a one- 
time signature scheme; let VK = VKi, . . . , VK„ denote the bits of the verification 
key. Define the matrix Avk as 


Ai 


,VKi 


A 


VK — 


Choose s <— Z™ uniformly at random, and choose an error vector x <— W£ n . The 
ciphertext is (VK,y,cr) where 


y = A V k • 


(mod q) 


and cr = Sign SK (y). 

Decryption. To decrypt a ciphertext (VK,y, cr), first verify that u is a correct 
signature on y and output T if not. Otherwise, parse y into n consecutive blocks 
yi, . . . , y n , where y, e Z™. Then, 


for a = 1 to q — 1 do 
Compute t := 

if a' = a then 


BDDSolve(Ti i vKi , ay) 


if IjA^vKi • t — ayi\\ < y/q/ 4 for all i € [n] then 
output w /a and stop 
else try the next value of a 

end 

If the above fails for all a, output JL 


Theorem 3. Let n,l,m,q,(3 be such that m > 4(n+ £)log 2 q and /3 < 1/(2 • 
m?n ■ w(-\/log n)). Then, the scheme above is a CCA-secure encryption scheme 
assuming the hardness of distLWE„ imigi/ g. 

The proof of correctness is similar to that of the CPA-secure encryption scheme. 
CCA-security follows from the ideas of |2()ll 2| . As we observed, the main change 
between our encryption scheme and the one in j2W12| is that the decryption 
algorithm tries to decrypt “all multiples of the ciphertext” . We defer the details 
of the proof to the full version. 

6.2 An Approximate SPH System 

Fix a public key {A ij0 , A, ; -| },; e [ n ], and a password dictionary V = f Z e q . The main 
difference from the presentation in Section 15.21 is in the definition of cipher- 
text validity: now, a labeled ciphertext (label, VK, y, cr) is defined to be valid 
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if Verify VK (label||y, a) = accept. Clearly, all honestly generated ciphertexts are 
valid and this condition can be checked in polynomial time. We define the sets 
X, L m , and L m for meD exactly as in Section El 

As in Section IH"21 a hash key is a fc-tuple of vectors (ei, . . . , e*,) where each 
e* *— Djjn r is drawn independently from the discrete Gaussian distribution. 
The projection function and the hash computation are the same, except that 
here they use the matrices B V k and Uvk respectively (instead of B and U in 
Section ^21 . In particular, this means that the projection function depends on 
the ciphertext (as allowed by the definition of an approximate SPH). The proof 
of the theorem below follows analogously to that of Theorem El we defer the 
proof to the full version of this paper. 

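Theorem 4. Let $m \ge 4(n + \ell)\log q$, $\beta \le 1/(2\cdot m^2 n\cdot\omega(\sqrt{\log n}))$ and $r$ be such
that

$$\sqrt{q}\cdot\omega(\sqrt{\log n}) \le r \le \varepsilon/(8\cdot mn^2\cdot\beta).$$

Then $\mathcal{H} = \{H_k\}_{k\in K}$ is an $\varepsilon$-approximate smooth projective hash system.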
Abstract. A fault attack consists in inducing hardware malfunctions in
order to recover secrets from electronic devices. One of the most famous
fault attacks is Bellcore’s attack against RSA with CRT; it consists in
inducing a fault modulo p but not modulo q at the signature generation
step; then by taking a gcd the attacker can recover the factorization of
N = pq. The Bellcore attack applies to any encoding function that is
deterministic, for example FDH. Recently, the attack was extended to
randomized encodings based on the ISO/IEC 9796-2 signature standard.
Extending the attack to other randomized encodings remains an open
problem.

In this paper, we show that the Bellcore attack cannot be applied to 
the PSS encoding; namely we show that PSS is provably secure against 
random fault attacks in the random oracle model, assuming that invert- 
ing RSA is hard. 

Keywords: Probabilistic Signature Scheme, Provable Security, Fault 
Attacks, Bellcore Attack. 


1 Introduction 

RSA HU is still the most widely used signature scheme in practical applications. 
To sign a message $m$ with RSA, the signer first applies an encoding function $\mu$ to
$m$, and then computes the signature $\sigma = \mu(m)^d \bmod N$. The signature is verified
by checking that $\sigma^e = \mu(m) \bmod N$. For efficiency reasons RSA signatures are
often computed using the Chinese Remainder Theorem (CRT); in this case the
signature is first computed modulo $p$ and $q$ separately:

$$\sigma_p = m^d \bmod p, \qquad \sigma_q = m^d \bmod q$$

and then $\sigma_p$ and $\sigma_q$ are combined by CRT to form the signature $\sigma$.

Boneh, DeMillo and Lipton showed that RSA signatures computed with CRT 
can be vulnerable to fault attacks 0 ■ If the attacker can induce a fault when o q 
is computed while keeping the computation of $\sigma_p$ correct, one obtains:

$$\sigma_p = m^d \bmod p, \qquad \sigma_q \ne m^d \bmod q$$

and the resulting faulty signature $\sigma$ satisfies

$$\sigma^e = m \bmod p, \qquad \sigma^e \ne m \bmod q.$$


M. Matsui (Ed.): ASIACRYPT 2009, LNCS 5912, pp. 653-666
Therefore, given one faulty signature $\sigma$, the attacker can recover the factorization
of $N$ by computing $\gcd(\sigma^e - m \bmod N, N) = p$. This attack actually applies
to any deterministic RSA encoding, e.g. Full Domain Hash (FDH) [2] with $\sigma = H(m)^d \bmod N$.
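
The gcd computation can be reproduced on toy parameters. The following is a minimal sketch, not a faithful fault model: the primes are far too small for real use, the integer `m` stands for the already encoded message, and the fault is simulated by replacing the half signature modulo $q$ with a wrong value. The helper name `crt_sign` and all constants are illustrative assumptions.

```python
from math import gcd

def crt_sign(m, d, p, q, fault_q=None):
    """RSA-CRT signature of m; if fault_q is given, it replaces the half mod q."""
    sp = pow(m, d % (p - 1), p)
    sq = pow(m, d % (q - 1), q) if fault_q is None else fault_q
    # Garner recombination: result is sp mod p and sq mod q
    h = ((sq - sp) * pow(p, -1, q)) % q
    return sp + p * h

p, q, e = 1009, 1013, 17            # toy primes, not secure
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
m = 123456                          # stands for the encoded message

good = crt_sign(m, d, p, q)         # correct signature

# Simulated fault: the half signature mod q is off by one
correct_sq = pow(m, d % (q - 1), q)
bad = crt_sign(m, d, p, q, fault_q=(correct_sq + 1) % q)

# The faulty signature is still correct mod p, so the gcd reveals p
recovered = gcd((pow(bad, e, N) - m) % N, N)
```

The attacker only needs the public key, the encoded message and a single faulty signature; no correct signature on the same message is required.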

More generally, the attack applies to any probabilistic scheme where the ran- 
dom used to generate the signature is sent along with the signature, e.g. as in the 
Probabilistic Full Domain Hash (pfdh) encoding pj where the signature is cr||r 
with cr = H(m | r) d mod N. In that case, given the faulty value of cr and knowing 
r, the attacker can still factor N by computing gcd (o e —H(m || r) mod N, N ) = p. 

However, if the random $r$ is not given to the attacker along with the signature
$\sigma$ then the Bellcore attack is thwarted. This is the case for signatures of the
form $\sigma = \mu(m, r)^d \bmod N$ where the random $r$ is only recovered when verifying
the signature, as in PSS [2]. To recover $r$ one needs a correct signature; from
a faulty signature, the attacker cannot retrieve $r$ nor infer $\mu(m, r)$ in order to
compute $\gcd(\sigma^e - \mu(m, r) \bmod N, N) = p$, unless $r$ is short enough to be guessed
by exhaustive search. Note that obtaining another correct signature for $m$ would
not help the attacker since with high probability a different random $r'$ would be
used to generate this signature.

Recently, it was shown how to extend Bellcore’s attack to a large class of 
randomized RSA encoding schemes |ZJ . The extended attack was illustrated with 
the iso/iec 9796-2 standard jTTI . iso/iec 9796-2 is originally a deterministic 
encoding scheme but often used in combination with message randomization, as 
in the emv standard |Bj- The iso/iec 9796-2 encoded message has the form 

$$\mu(m) = \mathtt{6A}_{16} \,\|\, m[1] \,\|\, H(m) \,\|\, \mathtt{BC}_{16}$$

where to = m[l] || to[ 2] is split into two parts. The authors of [Zj showed that if 
the randomness introduced into m[l] is not too large (e.g. less than 160 bits for 
a 2048-bit RSA modulus), then a single faulty signature allows to factor N as 
in the original Bellcore attack. The attack is based on Coppersmith’s technique 
for finding small roots of polynomial equations jS], which is based on the LLL 
algorithm m 

However, extending the attack to other randomized RSA signatures remains
an open problem. In particular, it is natural to ask whether the Bellcore attack
could apply to PSS [2], the most popular RSA-based signature scheme. In this
paper, we show that the Bellcore attack cannot be extended to PSS; namely we
show that PSS is provably secure against random fault attacks in the random
oracle model, assuming that inverting RSA is hard.

More precisely, we consider an extended model of security in which the
attacker, in addition to the regular signing oracle, has access to a faulty signature
oracle; that is, the attacker can request faulty signatures either modulo $p$ or
modulo $q$. For a faulty signature modulo $q$, the signer first generates the correct
value modulo $p$:

$$\sigma_p = \mu(m, r)^d \bmod p$$

but generates a random $\sigma_q$ modulo $q$. With CRT the signer then computes $\sigma'$
such that $\sigma' = \sigma_p \bmod p$ and $\sigma' = \sigma_q \bmod q$, and returns the faulty signature
$\sigma'$ to the adversary. Our result is that PSS is still secure under this extended
notion of security, in the random oracle model, assuming that inverting RSA is
hard.

2 Security Model 

We recall the definition of a signature scheme. 

Definition 1 (signature scheme). A signature scheme (Gen, Sign, Verify) is 
defined as follows: 

- The key generation algorithm Gen is a probabilistic algorithm which, given
$1^k$, outputs a pair of matching public and private keys, $(pk, sk)$.

- The signing algorithm Sign takes the message $M$ to be signed, the public key
$pk$ and the private key $sk$, and returns a signature $x = \mathsf{Sign}_{sk}(M)$. The signing
algorithm may be probabilistic.

- The verification algorithm Verify takes a message $M$, a candidate
signature $x'$ and $pk$. It returns a bit $\mathsf{Verify}_{pk}(M, x')$, equal to one if the
signature is accepted, and zero otherwise. We require that if $x \leftarrow \mathsf{Sign}_{sk}(M)$, then
$\mathsf{Verify}_{pk}(M, x) = 1$.

In the existential unforgeability under an adaptive chosen message attack
scenario, the forger can dynamically obtain signatures of messages of his choice and
attempts to output a valid forgery. A valid forgery is a message/signature pair
$(M, x)$ such that $\mathsf{Verify}_{pk}(M, x) = 1$ whereas the signature of $M$ was never
requested by the forger.

In the following, we consider an extended model of security in which the
attacker, in addition to the regular signing oracle, has access to a faulty signature
oracle; that is, the attacker can request faulty signatures either modulo $p$ or
modulo $q$. For a faulty signature modulo $q$, the signer first generates the correct
value modulo $p$:

$$\sigma_p = \mu(m, r)^d \bmod p$$

and generates a random $\sigma_q$ modulo $q$. With CRT the signer then computes $\sigma'$
such that $\sigma' = \sigma_p \bmod p$ and $\sigma' = \sigma_q \bmod q$, and returns the faulty signature
$\sigma'$ to the adversary. This is actually equivalent to first computing a correct
signature $\sigma$:

$$\sigma = \mu(m, r)^d \bmod N$$

and then generating a random $u$ modulo $q$ and computing the faulty signature:

$$\sigma' = \sigma + u\cdot p \bmod N$$
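
The claimed equivalence of the two descriptions of a faulty signature can be checked exhaustively on toy parameters. The sketch below is only an illustration: the primes, the value standing for a correct signature, and the helper names `faults_by_crt` and `faults_by_offset` are assumptions made here. It enumerates every possible faulty output under each description and compares the two sets.

```python
def crt(sp, sq, p, q):
    """Unique value mod p*q that is sp mod p and sq mod q (Garner's formula)."""
    h = ((sq - sp) * pow(p, -1, q)) % q
    return sp + p * h

def faults_by_crt(sigma, p, q):
    """All outputs of: keep sigma mod p, replace the part mod q by any value."""
    return {crt(sigma % p, t, p, q) for t in range(q)}

def faults_by_offset(sigma, p, q):
    """All outputs of: sigma' = sigma + u*p mod N, for every u in Z_q."""
    return {(sigma + u * p) % (p * q) for u in range(q)}

p, q = 1009, 1013          # toy primes, not secure
sigma = 424242             # stands for a correct signature modulo N
same_support = faults_by_crt(sigma, p, q) == faults_by_offset(sigma, p, q)
```

Since $p$ is invertible modulo $q$, the map $u \mapsto \sigma + u\cdot p \bmod q$ is a bijection of $\mathbb{Z}_q$, so a uniform $u$ corresponds to a uniform residue modulo $q$ and the two oracles produce the same distribution.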

Formally, we consider the following scenario between a challenger and an
attacker. Our scenario applies to any RSA-based signature scheme in which a
signature $\sigma$ is computed as $\sigma = \mu(m, r)^d \bmod N$ for some (randomized)
encoding function $\mu(m, r)$.

Setup: the challenger generates an RSA modulus $N = p\cdot q$, a public exponent
$e$ such that $\gcd(e, \phi(N)) = 1$ and a private exponent $d$ such that $e\cdot d = 1 \bmod \phi(N)$.
The challenger sends $(N, e)$ to the adversary.

Queries: the adversary can make regular signature queries to the challenger. In
this case, given a message $m$, the challenger generates a random $r$ and outputs
the (correct) signature:

$$\sigma = \mu(m, r)^d \bmod N$$

Additionally, the attacker can make faulty signature queries. For every such
query, the attacker specifies whether the fault should be modulo $p$ or modulo $q$.
For a faulty signature modulo $q$, the challenger first generates a random $r$ and
computes the correct signature:

$$\sigma = \mu(m, r)^d \bmod N$$

Then the challenger generates a random $u$ modulo $q$, and computes:

$$\sigma' = \sigma + u\cdot p \bmod N$$

and sends $\sigma'$ to the attacker. The challenger proceeds similarly if a faulty
signature modulo $p$ is requested.

Forgery: eventually the attacker must output a forgery, that is a message/signature
pair $(m, x)$ such that $\mathsf{Verify}_{pk}(m, x) = 1$ whereas the signature of $m$ was
never requested by the forger, neither as a regular signature query nor in a faulty
signature query.

This completes the description of the attack scenario. As usual, we say that a
signature scheme is $(t, \varepsilon)$-secure if no adversary running in time $t$ can output a
forgery with probability better than $\varepsilon$.

The PSS scheme was proven secure in the random oracle model [I], and our 
security proof with faulty signatures is also in the random oracle model. It is 
well known that a security proof in the random oracle model does not necessarily 
imply that a scheme is secure in the real world (see 0). Although it is always 
better to have a security proof in the standard model, we think that it is still 
better to have a proof in the random oracle model than no proof at all. 

2.1 Why Random Faults? 

In our security model we have assumed that when a faulty signature $\sigma'$ is
obtained, it has the uniform distribution modulo $p$ (or modulo $q$). This could be
seen as a very strong assumption; namely in practice the faults might have a
completely non-random distribution. Consider for example a fault attack forcing
the values of the registers to zero. This gives $\sigma_p = 0$ and recovering
$p$ is then straightforward: simply compute $\gcd(\sigma', N) = p$. To prevent this
attack we could assume that when a fault occurs the value $\sigma_p$ still has enough
min-entropy.
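
The zeroing fault is easy to reproduce on toy numbers. In the sketch below (illustrative primes and an arbitrary non-zero residue modulo $q$; the helper `crt` is hypothetical), the half signature modulo $p$ is forced to zero before recombination.

```python
from math import gcd

def crt(sp, sq, p, q):
    """Unique value mod p*q that is sp mod p and sq mod q (Garner's formula)."""
    h = ((sq - sp) * pow(p, -1, q)) % q
    return sp + p * h

p, q = 1009, 1013                   # toy primes, not secure
N = p * q
sigma_q = 777                       # any non-zero half signature mod q
faulty = crt(0, sigma_q, p, q)      # register holding sigma_p was zeroed
recovered = gcd(faulty, N)          # faulty is a multiple of p but not of q
```

Here the attacker does not even need the message: the faulty signature alone shares the factor $p$ with $N$.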

In the following we argue that 1) the random fault assumption is almost 
unavoidable if we want to obtain a security proof and 2) such assumption might 
actually be reasonable in practice. 
Assume that a fault gives a random $\sigma_p$ mod $p$ but with the $k$ most significant
bits set to 0, for some small integer $k$. That is, the attacker can obtain a list of
faulty signatures $\sigma'_i$ such that the corresponding $\sigma'_{i,p} = \sigma'_i \bmod p$ satisfy:

$$0 \le \sigma'_{i,p} < \frac{p}{2^k} \qquad (1)$$

for all $1 \le i \le n$, where $n$ is the number of faulty signatures. We show how to
recover $p$, using an attack similar to [13]. With LLL [12], the attacker computes
a short vector $(u_1, \ldots, u_n)$ such that:

$$\sum_i u_i\cdot\sigma'_i = 0 \bmod N$$

This implies:

$$\sum_i u_i\cdot\sigma'_{i,p} = 0 \bmod p$$

Since from (1) the $\sigma'_{i,p}$ are small modulo $p$, if the $u_i$'s are small enough, then the
equality will hold not only modulo $p$ but also over $\mathbb{Z}$:

$$\sum_i u_i\cdot\sigma'_{i,p} = 0$$

This gives a vector $(u_1, \ldots, u_n)$ that is orthogonal in $\mathbb{Z}$ to the unknown vector
$(\sigma'_{1,p}, \ldots, \sigma'_{n,p})$. It is shown in [13] that by generating sufficiently many such
vectors, one can recover the unknown vector $(\sigma'_{1,p}, \ldots, \sigma'_{n,p})$ and eventually $p$.

Note that this attack applies to any RSA-based signature scheme with CRT,
not only to PSS. This attack shows it is not enough for $\sigma_p$ to have min-entropy,
as only a few bits of entropy loss compared to the uniform distribution are enough
to recover $p$. Therefore, if we want to obtain a security proof, it seems necessary
to assume that $\sigma_p$ is uniformly distributed modulo $p$.

Actually the random fault assumption might be reasonable in practice. Namely,
to prevent probing attacks, the data being transmitted on the memory bus
inside the microprocessor is usually encrypted. Therefore, the content of a
register after a fault attack could still be some encrypted value, so it can be reasonable
to model this register value as uniformly random.

3 PSS Is Secure against Random Fault Attacks 

3.1 The PSS Scheme 

We recall the definition of the PSS scheme [2]. The scheme uses three hash
functions $h : \{0,1\}^* \to \{0,1\}^{k_1}$, $g_1 : \{0,1\}^{k_1} \to \{0,1\}^{k_0}$ and $g_2 : \{0,1\}^{k_1} \to \{0,1\}^{k-k_0-k_1-1}$,
where $k$, $k_0$ and $k_1$ are parameters.

Key Generation: generate a $k$-bit RSA modulus $N = pq$, and a random exponent
$e \in \mathbb{Z}^*_{\phi(N)}$. Generate $d$ such that $e\cdot d = 1 \bmod \phi(N)$. The public key is
$(N, e)$; the private key is $(N, d)$.
Signature generation: given a message $m$, do the following:

1. $r \leftarrow \{0,1\}^{k_0}$

2. $\omega \leftarrow h(m\|r)$

3. $r^* \leftarrow g_1(\omega) \oplus r$

4. $y \leftarrow 0\|\omega\|r^*\|g_2(\omega)$

5. Return $\sigma = y^d \bmod N$

Signature Verification: given a message $m$ and a signature $\sigma$, do the following:

1. Let $y = \sigma^e \bmod N$

2. Parse $y$ as $0\|\omega\|r^*\|\gamma$. If the parsing fails return 0.

3. $r \leftarrow r^* \oplus g_1(\omega)$

4. If $h(m\|r) = \omega$ and $g_2(\omega) = \gamma$ return 1.

5. else return 0.
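
The two procedures above can be sketched in a few lines. This is a toy illustration under loud assumptions, not a standards-compliant PSS implementation: the modulus is the product of two small Mersenne primes, the hash functions $h$, $g_1$, $g_2$ are instantiated here with truncated SHAKE-256 outputs and ad hoc domain-separation tags, and the parameter choices $k_0 = 32$, $k_1 = 64$ as well as all function names are hypothetical.

```python
import hashlib
import secrets

P, Q, E = 2**89 - 1, 2**107 - 1, 65537   # toy Mersenne primes, not secure
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))
K = N.bit_length()                       # modulus size k
K0, K1 = 32, 64                          # assumed toy parameters k0, k1
GLEN = K - K0 - K1 - 1                   # output length of g2

def _bits(tag, data, nbits):
    """First nbits of SHAKE-256(tag || data), as an integer."""
    nbytes = (nbits + 7) // 8
    digest = hashlib.shake_256(tag + data).digest(nbytes)
    return int.from_bytes(digest, "big") >> (8 * nbytes - nbits)

def h(m, r):
    return _bits(b"h", m + r.to_bytes(K0 // 8, "big"), K1)

def g1(w):
    return _bits(b"g1", w.to_bytes(K1 // 8, "big"), K0)

def g2(w):
    return _bits(b"g2", w.to_bytes(K1 // 8, "big"), GLEN)

def sign(m):
    r = secrets.randbits(K0)                           # step 1
    w = h(m, r)                                        # step 2
    r_star = g1(w) ^ r                                 # step 3
    y = (w << (K0 + GLEN)) | (r_star << GLEN) | g2(w)  # step 4: 0||w||r*||g2(w)
    return pow(y, D, N)                                # step 5

def verify(m, sigma):
    y = pow(sigma, E, N)
    if y >> (K - 1):                 # leading bit must be 0
        return False
    w = y >> (K0 + GLEN)
    r_star = (y >> GLEN) & ((1 << K0) - 1)
    gamma = y & ((1 << GLEN) - 1)
    r = r_star ^ g1(w)               # recover r from the signature
    return h(m, r) == w and g2(w) == gamma
```

A call such as `verify(b"hello", sign(b"hello"))` is expected to accept. Note that $r$ never leaves the signer in the clear: it is only recoverable from $\sigma^e \bmod N$, which is the property the fault analysis relies on.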

3.2 Security Proof 

We first give an intuition of the proof. We denote by $\mu(m, r)$ the PSS encoding
scheme, that is $\mu(m, r) = 0\|\omega\|r^*\|g_2(\omega)$ where $\omega = h(m\|r)$ and $r^* = g_1(\omega) \oplus r$.

We receive as input a challenge $(N, e, \eta)$ and we must output $\eta^d \bmod N$. In
the original PSS security proof [2], when receiving a signature query, the simulator
generates a random $\sigma$ modulo $N$ such that $\sigma^e \bmod N$ can be written as $0\|\omega\|s\|t$.
The simulator generates a random $r$ of $k_0$ bits. Then it lets $h(m, r) = \omega$, $g_1(\omega) = s \oplus r$
and $g_2(\omega) = t$. Therefore we have that $\mu(m, r) = (\sigma^e \bmod N)$. The simulator
can then return $\sigma$ as a signature for $m$. When receiving a hash query for $h(m, r)$,
the simulator generates a random $\alpha$ modulo $N$ such that $\eta\cdot\alpha^e$ can be written as
$0\|\omega\|s\|t$; it then proceeds as previously. In this case we have $\mu(m, r) = (\eta\cdot\alpha^e \bmod N)$.
Therefore a forgery for $\mu(m, r)$ makes it possible to compute $\eta^d \bmod N$.

One can see that if there is no collision on the randoms $r$ used for signature
generation, and no collision on the values $\omega$, then the simulation is perfect. Then
given a forgery $\sigma'$ for some message $m'$, with high probability we have that
$\mu(m', r') = (\eta\cdot\alpha^e \bmod N)$ for some known $\alpha$. Therefore from $\sigma' = \mu(m', r')^d \bmod N$
one can compute $\eta^d \bmod N$ as required and solve the RSA challenge.

In our extended model of security, we must additionally simulate a faulty
signature oracle. To do this, one could first generate as previously a random
$\sigma$ modulo $N$ such that $\sigma^e \bmod N$ can be written as $0\|\omega\|s\|t$. The simulator
generates a random $r$ of $k_0$ bits. Then it lets $h(m, r) = \omega$, $g_1(\omega) = s \oplus r$ and
$g_2(\omega) = t$, so that again $\mu(m, r) = (\sigma^e \bmod N)$. Then instead of returning the
correct signature $\sigma$, the simulator could generate a random $u$ modulo $q$, and
output the faulty signature:

$$\sigma' = \sigma + u\cdot p \bmod N \qquad (2)$$

Obviously our simulator cannot do this, because it does not know the prime
factors $p$ and $q$. Instead we show that the distribution of $\sigma'$ is statistically close
to uniform in $\mathbb{Z}_N$; therefore, the simulator can simply return a random $\sigma' \in \mathbb{Z}_N$.

Since RSA is a permutation, instead of considering the distribution of $\sigma'$, one
can consider the distribution of $y' = \sigma'^e \bmod N$. From (2) we have:

$$y' = y + v\cdot p \bmod N$$

where $v$ is uniformly distributed modulo $q$ and $y$ is uniformly distributed in
$[0, 2^{k-1}[$. The following lemma shows that the distribution of $y'$ is statistically
close to uniform in $\mathbb{Z}_N$.

Lemma 1. Let $N = pq$ be a $k$-bit modulus where $p$ and $q$ are $k/2$-bit, and let $y$
be a random integer such that $0 \le y < 2^{k-1}$. Let $v$ be a random integer modulo $q$.
Then the distribution of $y' = y + v\cdot p \bmod N$ is $\varepsilon$-statistically close to uniform
modulo N, with e = ^2 

Proof. We consider a fixed $a \in \mathbb{Z}_N$ and we provide an estimate of $\Pr[y' = a]$.
For this we consider the solutions of the equation:

$$a = y + v\cdot p \bmod N \qquad (3)$$

We have that for every $v \in [0, q)$, there exists a unique $y \in [0, N[$ which
satisfies the above relation. However we are only interested in the $y$'s in the range
$[0, 2^{k-1}[$. We have that for each $i \in [1, q]$, the pair:

$$(v = q - i,\; y = a + ip \bmod N)$$

is a solution of (3) iff

$$a + ip \bmod N < 2^{k-1} \qquad (4)$$

Depending on the choice of $a$, there are actually either $\lfloor 2^{k-1}/p \rfloor$ or $\lfloor 2^{k-1}/p \rfloor + 1$
many $i$ values which satisfy relation (4). Hence there are $\lfloor 2^{k-1}/p \rfloor$ or $\lfloor 2^{k-1}/p \rfloor + 1$
many solutions to congruence (3) such that $y < 2^{k-1}$. Since $y$ and $v$ are random
integers in the range $[0, 2^{k-1})$ and $[0, q)$ respectively, this gives:
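$$\Pr[y' = a] \ge \frac{\lfloor 2^{k-1}/p \rfloor}{2^{k-1}\cdot q}$$

We write $\lfloor 2^{k-1}/p \rfloor = c$, which gives $p\cdot c \le 2^{k-1} < p\cdot c + p$. We obtain:

$$\Pr[y' = a] \ge \frac{c}{2^{k-1}\cdot q} = \frac{1}{N}\cdot\frac{pc}{2^{k-1}} = \frac{1}{N}\left(1 - \frac{2^{k-1} - pc}{2^{k-1}}\right)$$

$$\ge \frac{1}{N}\left(1 - \frac{p}{2^{k-1}}\right) \quad (\text{as } 2^{k-1} < pc + p)$$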


»T ' 


Similarly, we have: 


Prb' = «]<(l + ?)-i 

This gives: 

for all a e [0. A 7 ). This implies that the distribution of y' is ^^-statistically 
close to uniform modulo N as q > 2 fe/2 ~ l . □ 
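
For intuition, the statement of Lemma 1 can be checked by brute force on a toy modulus. The sketch below is an illustration only: it uses the 12-bit modulus $N = 53\cdot 61$ (both primes are 6-bit, matching the lemma's hypothesis), enumerates every pair $(y, v)$, and computes the exact statistical distance to uniform. The reference value $4\cdot 2^{-k/2}$ is the per-query term that appears in Theorem 1; using it here as the comparison point is an assumption of this sketch, and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction

def distance_to_uniform(p, q, k):
    """Exact statistical distance between y + v*p mod N and uniform on Z_N."""
    N = p * q
    counts = Counter()
    for y in range(2 ** (k - 1)):        # y uniform in [0, 2^(k-1))
        for v in range(q):               # v uniform in [0, q)
            counts[(y + v * p) % N] += 1
    total = 2 ** (k - 1) * q
    l1 = sum(abs(Fraction(counts[a], total) - Fraction(1, N)) for a in range(N))
    return l1 / 2

p, q, k = 53, 61, 12                     # toy parameters: N = 3233 is 12-bit
dist = distance_to_uniform(p, q, k)
reference = Fraction(4, 2 ** (k // 2))   # 4 * 2^(-k/2), taken from Theorem 1
```

On these parameters the distance stays below the reference value, in line with the counting argument: every residue $a$ is hit by either $\lfloor 2^{k-1}/p \rfloor$ or $\lfloor 2^{k-1}/p \rfloor + 1$ pairs.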


Lemma 1 shows that it is sufficient for our simulator to return a random $\sigma'$
modulo $N$ as the faulty signature. In other words, instead of first generating a
random $y \in [0, 2^{k-1})$, then a random $v$ modulo $q$, then $y' = y + v\cdot p$ and finally
$\sigma' = y'^d \bmod N$, the simulator can simply output a random $\sigma'$ modulo $N$, and
such output will be statistically indistinguishable from a faulty signature.

However to this faulty signature $\sigma'$ corresponds a correct signature $\sigma$ such
that:

$$\sigma = \sigma' - u\cdot p \bmod N$$

where $u$ is randomly distributed modulo $q$. Equivalently, letting $y' = \sigma'^e \bmod N$,
there exists a corresponding value $y$ with:

$$y = y' - v\cdot p \bmod N \qquad (5)$$

where $v$ is randomly distributed modulo $q$ such that $y$ can be written

$$y = 0\|\omega\|s\|t = \mu(m, r)$$

This implicitly defines $h(m, r) = \omega$, $g_1(\omega) = s \oplus r$ and $g_2(\omega) = t$ for the simulation
of random oracles $h$, $g_1$ and $g_2$.

Since our simulator does not know $p$, it cannot compute $y$ in equation (5)
and therefore our simulator does not know the corresponding values of $\omega$, $s$
and $t$; therefore our simulator cannot answer the corresponding $h$ queries, $g_1$
queries and $g_2$ queries if such queries are made by the attacker. Intuitively for
$h$-queries it is sufficient that the set of $r$ values is exponentially large; for this
the parameter $k_0$ must be large enough. For $g_1$ and $g_2$ queries we must show
that the adversary has a negligible probability of querying $\omega$. This is shown in
the following lemma: we show that given a faulty signature $\sigma'$ (or equivalently
$y' = \sigma'^e \bmod N$) the distribution of $\omega$ has enough variability, if the parameter
$k_1$ is sufficiently large. This implies that $\omega$ does not need to be computed, and
therefore the factorization of $N$ is not needed for our simulation.

Lemma 2. Let $N = pq$ be a $k$-bit modulus where $p$ and $q$ are $k/2$-bit, and let $y$
be a random integer such that $0 \le y < 2^{k-1}$. Let $v$ be a random integer modulo
$q$, and let $y' = y + v\cdot p \bmod N$. Write $y = 0\|\omega\|x$ where $\omega$ is $k_1$-bit and $x$ is
$k - k_1 - 1$ bits. Given $y'$, for any $\omega'$ of $k_1$ bits we have:

$$\Pr[\omega = \omega' \mid y'] \le \frac{8}{2^{\min(k_1,\, k/2)}}$$


Proof. We have that:

$$\Pr[\omega = \omega' \mid y'] = \frac{\#(y, v) \text{ pairs s.t. } y' = y + v\cdot p \bmod N \text{ and } y = 0\|\omega'\|x}{\#(y, v) \text{ pairs s.t. } y' = y + v\cdot p \bmod N \text{ and } 0 \le y < 2^{k-1}}$$

For a fixed $v$, the value $y \bmod N$ gets fixed by the relation $y' = y + v\cdot p \bmod N$.
Moreover at least $\lfloor q/2 \rfloor$ of the possible $v$ values give $y \bmod N$ in the desired range
between 0 and $2^{k-1}$. Hence the denominator of the above fraction can be lower
bounded by $\lfloor q/2 \rfloor$.

We have that for a fixed $y'$, the value of $y$ is fixed modulo $p$; hence for a fixed
$\omega'$ with $y = 0\|\omega'\|x$, the value of $x$ is also fixed modulo $p$. As $x$ is $(k - k_1 - 1)$-bit,
over $\mathbb{Z}$ there can be at most $\lceil 2^{k-k_1-1}/p \rceil$ many possible $x$ values. Hence the
numerator of the above fraction can be upper bounded by $\lceil 2^{k-k_1-1}/p \rceil$.

Hence we have,

$$\Pr[\omega = \omega' \mid y'] \le \frac{\lceil 2^{k-k_1-1}/p \rceil}{\lfloor q/2 \rfloor} \le \frac{2^{k-k_1-1} + 2^{k/2-1}}{2^{k-3}} \le \frac{8}{2^{\min(k_1,\, k/2)}}$$


□ 
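
The bound of Lemma 2 can likewise be examined exhaustively on a toy modulus. The following sketch (toy primes, an assumed $k_1 = 4$, and a hypothetical function name) tabulates, for every observable $y'$, how often each value of $\omega$ occurs among the consistent pairs $(y, v)$, and returns the largest conditional probability.

```python
from collections import Counter, defaultdict
from fractions import Fraction

def max_conditional_probability(p, q, k, k1):
    """Largest Pr[omega = w | y'] over all y' and w, by exhaustive counting."""
    N = p * q
    shift = k - 1 - k1                   # omega = top k1 bits of the (k-1)-bit y
    joint = defaultdict(Counter)         # joint[y'][omega] = number of (y, v) pairs
    for y in range(2 ** (k - 1)):
        for v in range(q):
            joint[(y + v * p) % N][y >> shift] += 1
    return max(Fraction(max(c.values()), sum(c.values())) for c in joint.values())

p, q, k, k1 = 53, 61, 12, 4              # toy parameters: N = 3233 is 12-bit
worst = max_conditional_probability(p, q, k, k1)
bound = Fraction(8, 2 ** min(k1, k // 2))
```

For such small sizes the bound $8/2^{\min(k_1, k/2)}$ is loose; the point of the experiment is only that no value of $\omega$ becomes likely once $y'$ is known, since the candidates for $y$ are spread $p$ apart across the whole range.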


Formally, we obtain the following theorem: 

Theorem 1. Assume that no algorithm can invert RSA in time $t'$ with
probability better than $\varepsilon'$. Then the signature scheme PSS$[k_0, k_1]$ is $(t, q_h, q_g, q_s, q_{fs}, \varepsilon)$-secure,
where

$$t(k) = t'(k) - [q_s(k) + q_g(k) + q_h(k) + 1]\cdot k_0\cdot\Theta(k^3)$$

$$\varepsilon(k) = \varepsilon'(k) + (q_s + q_{fs} + 1)\cdot(q_s + q_{fs} + q_h)\cdot 2^{-k_0} + 8\cdot q_g\cdot q_{fs}\cdot 2^{-\min(k_1,\, k/2)}$$
$$\qquad + (q_h + q_s + q_{fs})\cdot(q_h + q_g + q_s + q_{fs} + 1)\cdot 2^{-k_1} + q_h\cdot q_{fs}\cdot 2^{-k_0} + 4\cdot q_{fs}\cdot 2^{-k/2}$$

Here the attacker can make at most $q_h$, $q_g$, $q_s$, $q_{fs}$ $h$ queries, $g$ queries,
signature queries and faulty signature queries respectively.
Proof. We use a simulator which behaves in exactly the same way as in the original PSS
security proof [2]; in addition it answers fault queries with a uniformly random
integer modulo $N$. Now if the attacker is successful against our simulator then
we break the RSA challenge $(N, e, \eta)$ as in the original paper.

We must show that any attacker which is successful against the original attack
scenario will be successful against our simulator. For that, we use a sequence of
games. We start with Game$_0$, which is exactly the attack scenario, which requires
knowledge of the factorization of $N$. Then we progressively modify the game, so that
eventually knowledge of the factorization of $N$ is not needed anymore. We denote
by $S_i$ the event that the attacker succeeds in Game$_i$.

Game$_0$: this is the attack scenario. We answer signature queries as specified in
the signature generation algorithm, using the private exponent $d$. We simulate
the faulty signature queries by first generating a correct signature $\sigma$ and then
computing $\sigma' = \sigma + u\cdot p \bmod N$ for a random $u$ modulo $q$. In the following for
simplicity we only consider faulty signatures modulo $q$; faulty signatures modulo
$p$ are simulated in exactly the same way.

Game$_1$: we abort if there is a collision for $\omega$ at Step 2 of the signature generation
algorithm, or if the random $r$ used during signature generation has already
appeared before. We call this event $A_1$. More precisely event $A_1$ happens if one of
the following is true:

- The random $r$ used in a signature oracle or faulty signature oracle query
collides with either 1) the $r$ used in a previous signature oracle or faulty
signature oracle query or 2) the $r$ used in a previous $h$ oracle query.

- The $h$ function output in a signature oracle or faulty signature oracle query
collides with either 1) the $h$ function outputs in previous signature oracle or
faulty signature oracle queries or 2) a previous $h$ oracle query output
or 3) a previous $g$ oracle query input.

- The $h$ oracle query output collides with either 1) an $h$ function output in a
previous signature oracle or faulty signature oracle query or 2) a previous $h$
oracle query output or 3) a previous $g$ oracle query input.

We obtain:

$$\Pr[A_1] \le (q_s + q_{fs})\cdot(q_s + q_{fs} + q_h)\cdot 2^{-k_0} + (q_h + q_s + q_{fs})\cdot(q_h + q_g + q_s + q_{fs})\cdot 2^{-k_1}$$

and:

$$|\Pr[S_1] - \Pr[S_0]| \le \Pr[A_1]$$

Game 2 : we construct a similar simulator as in the original PSS security proof |2J; 
however to deal with faulty signature queries we continue to use the factorization 
of N. 

The simulator receives as input a challenge rj and must output rj d mod N. 
When receiving a signature query, the simulator generates a random a modulo 
N such that a e mod N can be written as 0||cj||.s||t. The simulator generates a 
random r of ko bits. Then it lets h(m,r) = u, gi(u) = s © r and <? 2 (w) = t. 
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When receiving a hash query for h(m,r), the challenger generates a random 
a modulo N such that r/ • a e mod N can be written as 0 ||w||s||i; it then defines 
h(m,r) = u, gi(oj) = s ® r and g% (u>) = t as previously. The queries to <71 and 
eg'! are simulated by returning a random value for every new input. 

To simulate the faulty signature oracle, one first generates as above a random
$\sigma$ modulo $N$ such that $\sigma^e \bmod N$ can be written as $0\|\omega\|s\|t$. The simulator
generates a random $r$ of $k_0$ bits. Then it lets $h(m,r) = \omega$, $g_1(\omega) = s \oplus r$ and
$g_2(\omega) = t$. Then instead of returning $\sigma$, the simulator generates a random $u$
modulo $q$, and outputs:

$$\sigma' = \sigma + u \cdot p \bmod N \qquad (6)$$

In Game 2 we abort as in Game 1, and additionally in the following case: while
generating a random $\sigma$ modulo $N$ such that $\sigma^e \bmod N$ can be written as $0\|\omega\|s\|t$
during signature or faulty signature queries (and similarly for $h(m,r)$ queries),
we stop after trying $k_0 + 1$ times. This adds $(q_h + q_s + q_{fs}) \cdot 2^{-k_0}$ to the error
term:

$$|\Pr[S_2] - \Pr[S_1]| \le (q_h + q_s + q_{fs}) \cdot 2^{-k_0}$$

Game 3: we abort if the attacker makes a query for $g(\omega)$ where $\omega$ was used in a
faulty signature for message $m$ and random $r$, while the attacker has not made
a query to $h(m, r)$ before. We define this event as $A_3$. As all the query answers
are simulated independently, from Lemma El this gives:

$$|\Pr[S_3] - \Pr[S_2]| \le \Pr[A_3] \le q_g \cdot q_{fs} \cdot 2^{-\min(k_1, k/2)}$$

Game 4: we abort if the attacker makes a query for $h(m, r)$ where $r$ was used to
generate a faulty signature with $\omega$, while the attacker has not made a query
before to $g(\omega)$. In this case the attacker's view is independent from $r$, which
gives:

$$|\Pr[S_4] - \Pr[S_3]| \le q_h \cdot q_{fs} \cdot 2^{-k_0}$$

Game 5: we abort if the attacker makes a query for $h(m,r)$ where $r$ was used to
generate a faulty signature, or if the attacker makes a query for $g(\omega)$ where $\omega$ was
used in a faulty signature. Game 5 is the same as Game 4 since for a faulty signature
on $m$ with random $r$ and $\omega$, either the attacker starts with an $h(m, r)$ query or it
starts with a $g(\omega)$ query.

$$\Pr[S_5] = \Pr[S_4]$$

Game 6: we change the way the faulty signature oracle is simulated. Instead of
first generating $\sigma$ and then $\sigma'$ as in equation (6), we first generate a uniformly
random $\sigma'$ and then a random $u$ modulo $q$ such that $\sigma^e \bmod N$ can be written
as $0\|\omega\|s\|t$. From Lemma 1 we have:

|Pr[S 6 ]-Pr[S 5 ]|<g /s -^ 

Game 7: since we do not answer the queries for $h(m,r)$ where $r$ was used to
generate a faulty signature, and the queries for $g(\omega)$ where $\omega$ was used in a
faulty signature, we do not need to compute $\omega$. Therefore, we do not need to
compute a random $u$ modulo $q$ such that $\sigma^e \bmod N$ can be written as $0\|\omega\|s\|t$.
Therefore we do not need to know the factorization of $N$ anymore, and we have:

$$\Pr[S_7] = \Pr[S_6]$$

Finally, if the adversary outputs a forgery with probability at least $\varepsilon$ in Game 0,
then the adversary must output a forgery with probability at least
$\varepsilon - |\Pr[S_7] - \Pr[S_0]|$ in Game 7. As in the original PSS security proof, from this forgery we can
solve the RSA challenge with probability at least:

$$\varepsilon' = \varepsilon - |\Pr[S_7] - \Pr[S_0]| - 2^{-k_1}$$

Combining the previous inequalities, we get (0. □ 

4 PSS-R Is Secure against Fault Attacks 

In PSS-R or PSS with message recovery the goal is to save bandwidth such that 
the message is recoverable from the signature; hence it is not necessary to send 
the message separately. 


4.1 The PSS-R Scheme 

We recall the definition of the PSS-R scheme [2]. The scheme uses three hash
functions $h : \{0,1\}^* \to \{0,1\}^{k_1}$, $g_1 : \{0,1\}^{k_1} \to \{0,1\}^{k_0}$ and
$g_2 : \{0,1\}^{k_1} \to \{0,1\}^{k-k_0-k_1-1}$, where $k$, $k_0$ and $k_1$ are the parameters.

Key Generation: generate a $k$-bit RSA modulus $N = pq$, and a random exponent
$e \in \mathbb{Z}^*_{\phi(N)}$. Generate $d$ such that $e \cdot d = 1 \bmod \phi(N)$. The public key is
$(N, e)$; the private key is $(N, d)$.

Signature generation: given a message $m$, do the following:

1. $r \leftarrow \{0,1\}^{k_0}$
2. $\omega \leftarrow h(m\|r)$
3. $r^* \leftarrow g_1(\omega) \oplus r$
4. $m^* \leftarrow g_2(\omega) \oplus m$
5. $y \leftarrow 0\|\omega\|r^*\|m^*$
6. Return $\sigma = y^d \bmod N$

Message Recovery: given a signature $\sigma$, do the following:

1. Let $y = \sigma^e \bmod N$
2. Parse $y$ as $0\|\omega\|r^*\|m^*$. If the parsing fails return Reject.
3. $r \leftarrow r^* \oplus g_1(\omega)$
4. $m \leftarrow m^* \oplus g_2(\omega)$
5. If $h(m\|r) = \omega$ return $m$.
6. else return Reject.
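
The signing and recovery steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the modulus is built from two Mersenne primes and is far too small to be secure, the sizes `K0` and `K1` are arbitrary, and the random oracles $h$, $g_1$, $g_2$ are stood in for by truncated SHAKE-256 with hypothetical domain tags. None of these choices come from the scheme definition itself.

```python
import hashlib
import secrets

# Toy parameters, NOT secure: two Mersenne primes chosen only for brevity.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
K = N.bit_length()              # modulus size k in bits
K0, K1 = 32, 64                 # illustrative k_0 and k_1
MLEN = K - K0 - K1 - 1          # message length in bits: k - k_0 - k_1 - 1
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def _xof(tag: bytes, value: int, nbits: int) -> int:
    """Stand-in random oracle: SHAKE-256 output truncated to nbits."""
    data = tag + value.to_bytes((K + 7) // 8, "big")
    out = hashlib.shake_256(data).digest((nbits + 7) // 8)
    return int.from_bytes(out, "big") >> (8 * len(out) - nbits)


def h(m: int, r: int) -> int:
    return _xof(b"h", (m << K0) | r, K1)        # h(m || r)


def g1(w: int) -> int:
    return _xof(b"g1", w, K0)


def g2(w: int) -> int:
    return _xof(b"g2", w, MLEN)


def sign(m: int) -> int:
    """PSS-R signature generation, steps 1 to 6."""
    if not 0 <= m < 2**MLEN:
        raise ValueError("message does not fit in k - k0 - k1 - 1 bits")
    r = secrets.randbits(K0)
    w = h(m, r)
    r_star = g1(w) ^ r
    m_star = g2(w) ^ m
    y = (w << (K0 + MLEN)) | (r_star << MLEN) | m_star   # 0 || w || r* || m*
    return pow(y, D, N)


def recover(sigma: int):
    """PSS-R message recovery; returns the message or None for Reject."""
    y = pow(sigma, E, N)
    if y >> (K - 1):                      # leading bit must be 0
        return None
    w = y >> (K0 + MLEN)
    r_star = (y >> MLEN) & ((1 << K0) - 1)
    m_star = y & ((1 << MLEN) - 1)
    r = r_star ^ g1(w)
    m = m_star ^ g2(w)
    return m if h(m, r) == w else None
```

A call such as `recover(sign(m))` returns `m` for any integer message of at most `MLEN` bits, while a modified signature is rejected except with negligible probability.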
4.2 Security Proof

Theorem 2. Assume that no algorithm can invert RSA in time $t'$ with probability
better than $\varepsilon'$. Then the signature scheme PSS-R[$k_0$, $k_1$] is $(t, q_h, q_g, q_s, q_{fs}, \varepsilon)$-
secure, where:

t(k) = i!(k) — [&(&) + q g (k) + qh(k) + 1] • ko ■ S(k 3 ) 

$$\begin{aligned}
\varepsilon(k) = \varepsilon'(k) &+ (q_s + q_{fs} + 1) \cdot (q_s + q_{fs} + q_h) \cdot 2^{-k_0} + 8 \cdot q_g \cdot q_{fs} \cdot 2^{-\min(k_1, k/2)} \\
&+ (q_h + q_s + q_{fs}) \cdot (q_h + q_g + q_s + q_{fs} + 1) \cdot 2^{-k_1} \\
&+ q_h \cdot q_{fs} \cdot 2^{-k_0} + 4 \cdot q_{fs} \cdot 2^{-k/2}
\end{aligned}$$

Here the attacker can make at most $q_h$, $q_g$, $q_s$ and $q_{fs}$ $h$ queries, $g$ queries,
signature queries and faulty signature queries respectively.

Proof. The proof of this theorem is very similar to that of Theorem 1 and hence
is omitted.
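
To get a feel for the size of the additive terms in the bound for $\varepsilon(k)$, the following sketch evaluates them with exact rational arithmetic. The function name `pssr_bound` and the query counts in the example are hypothetical choices made for illustration only; $k$ is assumed to be even so that $k/2$ is an integer.

```python
from fractions import Fraction
import math


def pssr_bound(eps_prime, q_h, q_g, q_s, q_fs, k, k0, k1):
    """Right-hand side of the epsilon(k) bound in Theorem 2 (k assumed even)."""
    def two(e):
        return Fraction(1, 2**e)            # 2^(-e) as an exact rational

    return (
        Fraction(eps_prime)
        + (q_s + q_fs + 1) * (q_s + q_fs + q_h) * two(k0)
        + 8 * q_g * q_fs * two(min(k1, k // 2))
        + (q_h + q_s + q_fs) * (q_h + q_g + q_s + q_fs + 1) * two(k1)
        + q_h * q_fs * two(k0)
        + 4 * q_fs * two(k // 2)
    )


# Hypothetical workload: 2^60 hash queries, 2^30 (faulty) signature queries.
loss = pssr_bound(0, q_h=2**60, q_g=2**60, q_s=2**30, q_fs=2**30,
                  k=1024, k0=128, k1=128)
print(f"additive loss is about 2^{math.log2(loss):.1f}")
```

Varying `q_fs` against `q_h` in this sketch shows which term dominates for a given workload.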

5 Conclusion 

We obtain from the previous theorems that unless the attacker is making more 
fault oracle queries than hash oracle queries, one gets the same security bound 
as in the original PSS proof without fault oracle. We note that in practice fault 
queries are usually more expensive than hash queries, since those hash queries 
can be made offline when a concrete hash function is used. 

In 0 a better security bound was given for PSS (without fault oracle). It was
shown that the random size $k_0$ could be taken as small as $\log_2 q_s$, where $q_s$ is
the maximum number of signature queries; with $q_s = 2^{30}$ this gives $k_0 = 30$ bits.
However, with a fault oracle one cannot take such a small $k_0$, since in this case
the random $r$ could be recovered by exhaustive search and the Bellcore attack
would still apply.

In summary, any parameters chosen according to the bounds in the original
PSS paper [2] give the same level of security against fault attacks. One can take
$k = 1024$, $k_0 = k_1 = 128$ as in 0.

Abstract. Cache-timing attacks are a serious threat to security-critical
software. We show that the combination of vector quantization and hid- 
den Markov model cryptanalysis is a powerful tool for automated analysis 
of cache-timing data; it can be used to recover critical algorithm state 
such as key material. We demonstrate its effectiveness by running an 
attack on the elliptic curve portion of OpenSSL (0.9.8k and under). This 
involves automated lattice attacks leading to key recovery within hours. 
We carry out the attack on live cache-timing data without simulating 
the side channel, showing these attacks are practical and realistic. 

Keywords: cache-timing attacks, side channel attacks, elliptic curve 
cryptography. 


1 Introduction 

Traditional cryptanalysis views cryptographic systems as mathematical abstrac- 
tions, which can be attacked using only the input and output data of the system. 
As opposed to attacks on the formal description of the system, side channel at- 
tacks m are based on information that is gained from the physical implemen- 
tation of the system. Side channel leakages might reveal information about the 
internal state of the system and can be used in conjunction with other crypt- 
analytic techniques to break the system. Side channel attacks can be based on 
information obtained from, for example, power consumption, timings, electro- 
magnetic radiation or even sound. Active attacks in which the attacker manip- 
ulates the operation of the system by physical means are also considered side 
channel attacks. 

Our focus is on cache-timing attacks in which side channel information is 
gained by measuring cache access times; these are trace-driven attacks |3j. We 
place importance on automated analysis for processing large volumes of cache- 
timing data over many executions of a given algorithm. Hidden Markov models 
(HMMs) provide a framework, where the relationship between side channel ob- 
servations and the internal states of the system can be naturally modeled. HMMs 
for side channel analysis were previously studied by Oswald j3j, and models for
key inference given by Karlof and Wagner jS] and Green et al. jS]. While their
proposed models make use of an abstract side channel, we are concerned with 
concrete cache-timing data here. 

The analysis additionally makes use of Vector Quantization (VQ) for classi- 
fication. Cache-timing data is viewed as vectors that are matched to predefined 
templates, obtained by inducing the algorithm to perform in an unnatural man- 
ner. This can often easily be accomplished in software. 

Abstractly, it is reasonable to consider the analysis shown here as a form of 
template attack jjj used in power analysis of symmetric cryptographic primitive 
implementations, and more recently for asymmetric primitives 0. Chari et al. 0 
formalize exactly what a template is: A precise model for the noise and expected 
signal for all possible values for part of the key. Their attack is then carried out 
iteratively to recover successive parts of the key. 

It is difficult and not particularly prudent to model cache-timing attacks ac- 
cordingly. In lieu of such explicit formalization, we borrow from them in name 
and in spirit: The attacker has some device or code in their possession that they 
can give input to, program, or modify in some way that forces it to perform in 
a certain manner, while at the same time obtaining measurements from the side 
channel. 

Using the described analysis method, we carry out an attack on the elliptic 
curve portion of OpenSSL (0.9.8k). Within hours, we are able to recover the 
long-term private key in ECDSA by observing cache-timing data, signatures, and 
messages. Our attack exploits a weakness that stems from the use of a low-weight 
signed representation for scalars during point multiplication. The algorithm uses 
a precomputation table of points that are accessed during point addition steps. 
The lookups are reflected in the cache-timings, leaking critical algorithm state. 
A significant fraction of ECDSA nonce portions can be determined this way. 
Given enough such information, we are able to recover the private key using a 
lattice attack. 

The paper is structured as follows. In Sect. 2 we give background on cache
architectures and various published cache attacks. In Sect. 3 we review elliptic
curve cryptography and the implementation in OpenSSL. Section 4 covers VQ
and how to apply it effectively to cache-timing data analysis. In Sect. 5 we
discuss HMMs and describe how they are used in our attack, but also how they
can be used to facilitate side channel attacks in general. We present our results
in Sect. 6 and countermeasures briefly in Sect. 7. We conclude in Sect. 8.

2 Cache Attacks 

We begin with a brief review of modern CPU cache architectures. This is followed 
by a selective literature review of cache attacks on cryptosystem implementations. 

2.1 Data Caches 

A CPU has a limited number of working registers to store data. Modern
processors are equipped with a data cache to offset the high latency of loading data
from main memory into these registers. When the CPU needs to access data,
it first looks in the data cache, which is faster but with smaller capacity than
main memory. If it finds the data in the cache, it is loaded with minimal latency
and this is known as a cache hit; otherwise, a cache miss occurs and the latency
is higher as the data is fetched from successive layers of caches or even main
memory. Thus access to frequently used data has lower latency. Cache layers L1,
L2, and L3 are commonplace, increasing with capacity and latency. We focus on
data caches here, but processors often have an instruction cache as well. 

The cache replacement policy determines where data from main memory is 
stored in the cache. At opposite ends of the spectrum are a fully-associative cache 
and a direct mapped cache. Respectively, these allow data from a given memory 
location to be stored in any location or one location in the cache. The trade-off 
is between complexity and latency. A compromise is an N-way associative cache,
where each location in memory can be stored in one of N different locations in 
the cache. The cache locations, or lines, then form a number of associative sets 
or congruency classes. 

We give the L1 data cache details for the two example processors under
consideration here.

Intel Atom. The L1 data cache consists of 384 lines of 64B each for a total of
24KB. It is 6-way associative, thus the lines are divided into 64 associative
sets.

Intel Pentium 4. The L1 data cache consists of 128 lines of 64B each for a total
of 8KB. It is 4-way associative, thus the lines are divided into 32 associative
sets.
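
The set counts quoted above follow directly from dividing the number of lines by the associativity. The sketch below reproduces that arithmetic. The `set_index` helper additionally assumes the common mapping in which consecutive line-sized blocks of memory fall into consecutive sets; that mapping is an assumption for illustration and is not stated in the text.

```python
def associative_sets(total_lines: int, ways: int) -> int:
    """Number of associative sets in an N-way cache with the given line count."""
    if total_lines % ways:
        raise ValueError("line count must be a multiple of the associativity")
    return total_lines // ways


def set_index(address: int, line_size: int, n_sets: int) -> int:
    """Assumed mapping: consecutive memory lines map to consecutive sets."""
    return (address // line_size) % n_sets


atom_sets = associative_sets(384, 6)       # Intel Atom L1 data cache
p4_sets = associative_sets(128, 4)         # Intel Pentium 4 L1 data cache
print(atom_sets, p4_sets)
```

Under the assumed mapping, two addresses that are `line_size * n_sets` bytes apart compete for the same set, which is what lets one process evict another's data.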

We focus on these because they implement Intel’s HyperThreading, a form of 
Simultaneous Multithreading (SMT) that allows active execution of multiple 
threads concurrently. In a cache-timing attack scenario, this relaxes the need to 
force context switches since the threads naturally compete for shared resources 
during execution, such as the data caches. The newly-released (Nov. 2008) Intel 
i7 also features HyperThreading; it has the same number of associative sets as 
the Intel Atom. 

2.2 Published Attacks 

Percival 0 demonstrated a cache-timing attack on OpenSSL 0.9.7c (30 Sep. 
2003) where a classical sliding window was used twice for exponentiation for two 
512-bit exponents in combination with the CRT to carry out a 1024-bit RSA 
encryption operation. Sliding window exponentiation computes $\beta^e$ by sliding
a width-$w$ window across $e$ with placement such that the value falling in the
window is odd. It then uses a precomputation table $\beta^i$ for all odd $1 < i < 2^w$,
accessed during multiplication steps; this lookup is reflected in the cache-timings,
demonstrated on a Pentium 4 with HyperThreading. The sequence of squarings
and multiplications yields significant key data: recovery of 200 bits out of each
512-bit exponent, and jO] claimed an additional 110 bits from each exponent due
to fixed memory access patterns revealing information about the index to the
precomputation table and thus key data. Assuming the absence of errors, 0
reasoned how this allows the RSA modulus to be factored in reasonable time.
OpenSSL responded to the vulnerability in 0.9.7h (11 Oct. 2005) by modifying 
the exponentiation routine. 
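
As a rough illustration of why the operation sequence leaks exponent bits, here is a minimal left-to-right sliding window exponentiation that also records each squaring (`S`) and multiplication (`M`). It is a textbook sketch, not OpenSSL's routine; the function name and the trace format are made up for this example.

```python
def sliding_window_pow(base: int, exp: int, mod: int, w: int):
    """Compute base**exp % mod and return it with the S/M operation trace."""
    # Table of odd powers base^1, base^3, ..., base^(2^w - 1).
    table = {i: pow(base, i, mod) for i in range(1, 2**w, 2)}
    bits = bin(exp)[2:]
    result, ops, i = 1 % mod, [], 0
    while i < len(bits):
        if bits[i] == "0":
            result = result * result % mod
            ops.append("S")
            i += 1
        else:
            # Take at most w bits, then shrink so the window ends in a 1 (odd value).
            j = min(i + w, len(bits))
            while bits[j - 1] == "0":
                j -= 1
            for _ in range(j - i):
                result = result * result % mod
                ops.append("S")
            result = result * table[int(bits[i:j], 2)] % mod   # table lookup
            ops.append("M")
            i = j
    return result, "".join(ops)


value, trace = sliding_window_pow(5, 0b1011011101, 1000003, 4)
print(value, trace)
```

Each `M` in the trace marks the end of an odd window, so an observer who sees the trace learns where the windows lie, and which table entry was fetched narrows the window value further.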

Hlavac and Rosa [10] used a similar approach to demonstrate a lattice attack
on DSA signatures with known nonce portions. They estimated that after ob- 
serving 6 authentications to an OpenSSH server, which uses OpenSSL (< 0.9.7h) 
for DSA signatures, an attacker will have a high success probability when run- 
ning a lattice attack to recover the private key. They state that the side channel 
was emulated for the experiments. 

The numerous published attacks against secret key implementations are note- 
worthy. Among others, these include attacks on AES by Bernstein HU and Osvik 
et al. m Both papers present key recovery attacks on various implementations. 

3 Elliptic Curve Cryptography 

To demonstrate the effectiveness of the analysis method, we will look at one 
particular implementation of ECC. We stress that the scope of the analysis is 
much larger; this is merely one example of how it can be used. 

Given a point P on an elliptic curve and scalar k, scalar multiplication com- 
putes kP. This operation is the performance benchmark for an elliptic curve 
cryptosystem. It is normally carried out using a double-and-add approach, of
which there are many varieties. We outline a common one later in this section. 

Our attack is demonstrated on an implementation of scalar multiplication used 
by ECDSA signature generation. A signature (r, s ) on a message m is produced 
using 


$$r = x(kG) \bmod n \qquad (1)$$

$$s = k^{-1}(h(m) + rd) \bmod n \qquad (2)$$

with point $G$ of order $n$, nonce $k$ chosen uniformly from $[1, n)$, $x(P)$ the projection
of $P$ to its $x$-coordinate, $h$ a collision-resistant hash function, and $d$ the long-term
private key corresponding to the public key $D = dG$.
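
Equations (1) and (2) can be exercised on a tiny curve. The sketch below uses the textbook curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with generator $(5, 1)$ of order 19, which is an assumption chosen purely so the numbers stay small; the hash value is passed in directly as an integer. The last function shows why nonce information is so damaging: rearranging (2) gives $d = (sk - h(m)) r^{-1} \bmod n$ once $k$ is known.

```python
P_MOD, A_COEF = 17, 2            # toy curve y^2 = x^3 + 2x + 2 over F_17
G, N_ORDER = (5, 1), 19          # generator and its (prime) order


def add(p1, p2):
    """Affine point addition; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A_COEF) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return x3, (lam * (x1 - x3) - y1) % P_MOD


def mul(k, point):
    """Plain left-to-right double-and-add."""
    q = None
    for bit in bin(k)[2:]:
        q = add(q, q)
        if bit == "1":
            q = add(q, point)
    return q


def sign(hm, d, k):
    """Equations (1) and (2); hm is the already-hashed message."""
    r = mul(k, G)[0] % N_ORDER
    s = pow(k, -1, N_ORDER) * (hm + r * d) % N_ORDER
    return r, s


def verify(hm, pub, r, s):
    w = pow(s, -1, N_ORDER)
    x = add(mul(hm * w % N_ORDER, G), mul(r * w % N_ORDER, pub))
    return x is not None and x[0] % N_ORDER == r


def key_from_nonce(hm, r, s, k):
    """Solve equation (2) for d when the nonce k is known."""
    return (s * k - hm) * pow(r, -1, N_ORDER) % N_ORDER
```

With only partial knowledge of many nonces the same relation becomes a system of approximate equations in $d$, which is the setting a lattice attack addresses.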

3.1 ECC in OpenSSL 

OpenSSL treats two cases of elliptic curves over binary and prime fields sepa- 
rately and implements scalar multiplication in two ways accordingly. We con- 
sider only the latter case, where a general multi-exponentiation algorithm is 
used [13,14]. The algorithm works left-to-right and uses interleaving, where one
accumulator is used for the result and point doublings are shared; low-weight 
signed representations are used for individual scalars. 

When only one scalar is passed, as in (1) or when creating a signature using
the OpenSSL command line tool, it reduces to a rather textbook version of scalar
multiplication, in this case using the modified Non-Adjacent Form mNAF$_w$ (see,
for example, uni). This is reflected in the pseudocode below. OpenSSL has the


Cache-Timing Template Attacks 671 


ability to store the precomputed points in memory, so with a fixed P such as a 
generator they need not necessarily be recomputed for each invocation. 

The representation mNAF$_w$ is very similar to the regular windowed NAF$_w$.
Each non-zero coefficient is followed by at least $w - 1$ zero coefficients, except
for the most significant digit which is allowed to violate this condition in some
cases to reduce the length of the representation by one while still retaining the
same weight. Considering the MSBs of NAF$_w$, one applies $1\,0^{w-1}\,\delta \mapsto 0\,1\,0^{w-2}\,\varepsilon$
where $\delta < 0$ and $\varepsilon = 2^{w-1} + \delta$ when possible to obtain mNAF$_w$.


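Algorithm: Scalar Multiplication
Input: $k \in \mathbb{Z}$, $P \in E(\mathbb{F}_p)$, width $w$
Output: $kP$
  $(k_{\ell-1} \ldots k_0) \leftarrow \mathrm{mNAF}_w(k)$
  Precompute $iP$ for all odd $0 < i < 2^{w-1}$
  $Q \leftarrow k_{\ell-1}P$
  for $i \leftarrow \ell - 2$ to $0$ do
    $Q \leftarrow 2Q$
    if $k_i \ne 0$ then $Q \leftarrow Q + k_iP$
  return $Q$

Algorithm: Modified NAF$_w$
Input: window width $w$, $k \in \mathbb{Z}$
Output: mNAF$_w(k)$
  $i \leftarrow 0$
  while $k \ge 1$ do
    if $k$ is odd then $k_i \leftarrow k \ \mathrm{mods}\ 2^w$, $k \leftarrow k - k_i$
    else $k_i \leftarrow 0$
    $k \leftarrow k/2$, $i \leftarrow i + 1$
  if $k_{i-1} = 1$ and $k_{i-1-w} < 0$ then
    $k_{i-1-w} \leftarrow k_{i-1-w} + 2^{w-1}$
    $k_{i-1} \leftarrow 0$, $k_{i-2} \leftarrow 1$, $i \leftarrow i - 1$
  return $(k_{i-1}, \ldots, k_0)$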
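
The two algorithms above translate almost line by line into Python. In this sketch the elliptic curve group is replaced by the integers under addition (a "point" is just an integer, doubling is multiplication by two), an assumption made so the result can be checked against `k * P` without any curve arithmetic. The digit list is little-endian, `mods` is the signed residue in $[-2^{w-1}, 2^{w-1})$, and $w \ge 2$ is assumed.

```python
def mods(k: int, w: int) -> int:
    """Signed residue of k modulo 2**w, in the range [-2**(w-1), 2**(w-1))."""
    r = k % (1 << w)
    return r - (1 << w) if r >= (1 << (w - 1)) else r


def mnaf(k: int, w: int) -> list:
    """Modified NAF_w digits of k >= 1, least significant digit first."""
    digits = []
    while k >= 1:
        if k & 1:
            d = mods(k, w)
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    i = len(digits)
    # MSB fix-up: 1 0^(w-1) delta  ->  0 1 0^(w-2) (2^(w-1) + delta), delta < 0.
    if i - 1 - w >= 0 and digits[i - 1] == 1 and digits[i - 1 - w] < 0:
        digits[i - 1 - w] += 1 << (w - 1)
        digits[i - 2] = 1
        digits.pop()
    return digits


def scalar_mult(k: int, point: int, w: int):
    """Left-to-right scalar multiplication; returns (kP, operation trace)."""
    digits = mnaf(k, w)
    table = {i: i * point for i in range(1, 1 << (w - 1), 2)}   # odd multiples

    def lookup(d):
        return table[d] if d > 0 else -table[-d]

    q = lookup(digits[-1])
    ops = []
    for d in reversed(digits[:-1]):
        q = 2 * q                      # point doubling
        ops.append("D")
        if d != 0:
            q = q + lookup(d)          # point addition with table lookup
            ops.append("A")
    return q, "".join(ops)


print(mnaf(13, 4), scalar_mult(13, 7, 4))
```

The returned trace of `D` and `A` operations is the kind of sequence the side channel is expected to expose.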


3.2 Cache Attack Vulnerability 

Following the description of the mNAF$_w$ representation, knowledge of the curve
operation sequence corresponds directly to the algorithm state, yielding quite a
lot of key data. Point additions take place when a coefficient $k_i \ne 0$ and these
are necessarily followed by $w$ point doublings due to the scalar representation.
From the side channel perspective, consecutive doublings allow inference of zero
coefficients, and more than $w$ point doublings reveals non-trivial zero coefficients.
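
The inference described above can be sketched as a small parser. Given a trace of doublings (`D`) and additions (`A`), it marks each coefficient position as zero (`0`) or non-zero (`x`), most significant first. The trace format and the function name are hypothetical; the leading coefficient is taken to be non-zero because it initializes the accumulator without a recorded addition.

```python
def coefficients_from_ops(ops: str) -> str:
    """Map a D/A operation trace to a zero / non-zero coefficient pattern."""
    pattern = ["x"]                    # leading coefficient initializes Q
    for op in ops:
        if op == "D":
            pattern.append("0")        # a doubling starts a new coefficient
        elif op == "A":
            pattern[-1] = "x"          # an addition marks it as non-zero
        else:
            raise ValueError(f"unknown operation {op!r}")
    return "".join(pattern)


print(coefficients_from_ops("DDDDADDDDDA"))
```

Every `0` in the result is a fully known coefficient, while each `x` still hides which odd value was added.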

Without any countermeasures, the above scalar multiplication routine is vul- 
nerable to cache-timing attacks. The points in the precomputation phase are 
stored in memory; when a point addition takes place, the point to be added is 
loaded into the cache. An attacker can detect this by concurrently running a spy
process jS] that does nothing more than continually load its own data into the
cache and measure the time required to read from all cache lines in a cache set,
iterating the process for all cache sets. Fast cache access times indicate cache hits:
the scalar multiplication routine has not aggressively accessed those cache
locations since the last iteration. Had it done so, it would have evicted the spy
process data from those cache locations, causing cache misses and thus slower
cache access times for the spy process.

In Fig. 1 we illustrate typical cache-timing data obtained from a spy process
running on a Pentium 4 (Top) and Atom (Bottom) with OpenSSL 0.9.8k


672 B.B. Brumley and R.M. Hakala 


performing an ECDSA signature operation concurrently. The top eight rows of 
each graph are metadata; the lower half represents the VQ label and the upper 
half the algorithm state. We show how we obtained the metadata in Sect. 4 and
Sect. 5, respectively. The remaining cells are the actual cache-timing data. Each
cell in these figures indicates a cache set access time. Technically, time moves 
within each individual cell, then from bottom to top through all cache sets, then 
from left to right repeating the measurements. To visualize the data, it is ben- 
eficial to consider the data as vectors with length equal to the number of cache 
sets, and time simply moves left to right. 

To manually analyze such traces and determine what operations are being per- 
formed we look for (dis)similarities between neighboring vectors. These graphs 
show seven (Top) and eight (Bottom) point additions, with repeated point dou- 
blings occurring between each addition. As an attacker, we hope to find correla- 
tion between these point additions and the cache access times — which we easily 
find here. Additions in the top graph are visible at rows 13 and 24, among others; 
the bottom graph, rows 6, 7, 55, 56. The reader is encouraged to use the vector 
quantization label to help locate the point additions (black label). 



Fig. 1. Cache-timing data from a spy process running concurrently with an OpenSSL
0.9.8k ECDSA signature operation; 160-bit curve, mNAF$_4$. Top: Pentium 4 timing data,
seven point additions. Bottom: Atom timing data, eight point additions. Repeated point
doublings occur between the additions. The top eight rows are metadata; the bottom
half the VQ label (Sect. 4) and top half the HMM state (Sect. 5). All other cells are
the raw timing data, viewed as column vectors from left to right with time.

4 Vector Quantization

Automated analysis of cache-timing data like that shown in Fig. 1 is not a trivial
task. When given just one trace, for simplistic algorithms it is sometimes possible 
to interpret the data manually. For many traces or complex algorithms this is 
not feasible. We aim to automate the process; the analysis begins with VQ. 

A vector quantizer is a map $V : \mathbb{R}^n \to C$ with $C \subset \mathbb{R}^n$ where the set
$C = \{c_1, \ldots, c_a\}$ is called the codebook. A typical definition is
$V : v \mapsto \arg\min_{c \in C} D(v, c)$ where $D$ measures the $n$-dimensional Euclidean distance between $v$ and
$c$. One also associates a labelling $L : C \to \mathcal{L}$ with the codebook vectors; this can
be as trivial as $\mathcal{L} = \{1, \ldots, a\}$ depending on the application.

Here, we are particularly interested in VQ classification; input vectors are
mapped to the closest vector in the codebook, then assigned the corresponding
label for that codebook entry. In this manner, input vectors with the same
labelling share some user-defined quality and are grouped accordingly. The clas- 
sification quality depends on how well the codebook vectors approximate input 
data for their label. We elaborate on building the codebook C below. 
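
A minimal sketch of VQ classification as defined above: `quantize` plays the role of $V$ (returning the index of the nearest codebook vector under Euclidean distance) and `classify` applies the labelling $L$. The function names and the toy codebook are illustrative assumptions.

```python
import math


def quantize(v, codebook):
    """Index of the codebook vector closest to v in Euclidean distance."""
    return min(range(len(codebook)), key=lambda i: math.dist(v, codebook[i]))


def classify(v, codebook, labels):
    """Label of the closest codebook vector, i.e. L(V(v))."""
    return labels[quantize(v, codebook)]


codebook = [[0.0, 0.0], [10.0, 10.0]]
labels = ["D", "A"]
print(classify([1.0, 2.0], codebook, labels), classify([9.0, 8.0], codebook, labels))
```

In the cache-timing setting each input vector would have one component per cache set, so the same code applies with longer vectors.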

4.1 Learning Vector Quantization 

To learn the codebook vectors, we employ LVQ [EJ. This process begins with a
set $T = \{(t_1, l_1), \ldots, (t_j, l_j)\}$ of training vectors and predetermined corresponding
labels, as well as an approximation to $C$. This is commonly derived by taking
the $k$ centroids resulting from $k$-means clustering m on all $t_i$ sharing the same
label. LVQ in its simplest form then proceeds as follows. For each $(t_i, l_i) \in T$, if
$L(V(t_i)) = l_i$ the classification is correct and the matching codebook vector is
pulled closer to $t_i$; otherwise, it is incorrect and the codebook vector is pushed away. This process is
iterated until an acceptable error rate is achieved.
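
The update rule just described corresponds to the simplest LVQ variant. The sketch below performs one pass over the training set and returns the number of misclassified vectors so the caller can iterate until the error rate is acceptable. The fixed learning rate `rate` and the function name are assumptions; practical implementations usually decay the rate over time.

```python
import math


def lvq1_epoch(training, codebook, labels, rate=0.1):
    """One LVQ pass: pull the winner toward t if correct, push it away if not."""
    errors = 0
    for t, l in training:
        i = min(range(len(codebook)), key=lambda c: math.dist(t, codebook[c]))
        sign = 1 if labels[i] == l else -1
        if sign < 0:
            errors += 1
        codebook[i] = [c + sign * rate * (x - c) for c, x in zip(codebook[i], t)]
    return errors


codebook = [[0.0], [1.0]]
labels = ["D", "A"]
training = [([0.2], "D"), ([0.4], "A")]
errors = lvq1_epoch(training, codebook, labels)
print(errors, codebook)
```

The codebook is modified in place, so repeated calls continue refining the same vectors.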

4.2 Cache-Timing Data Templates 

We apply the above techniques to analyze cache-timing data. Taking the working 
example in Fig. 1, for the Pentium 4 we have $n = 32$ and for the Atom $n = 64$ as the
dimension of the cache-timing data vectors; this is the number of cache sets.
For simplicity we define $\mathcal{L} = \{D, A, E\}$ to label vectors belonging to respective
operations double, addition, or beginning/end.

Next, we build the training data T. This is somewhat simplified for an attacker 
as they can create their own private key and generate signatures to produce 
training data. Nevertheless, extracting individual vectors by hand proves quite 
tedious and error-prone. Also, if the spy process executes multiple times, there 
is no guarantee where the memory buffer for the timing data will be allocated. 
From execution to execution, the vectors will likely look quite different. 

Inspired by template attacks (Zj, we instead modify the software in such a way
that it performs only a single task we would like to distinguish. For the scalar
multiplication routine shown in Sect. 3.1 we force the algorithm to perform only
point doubling (addition) and collect templates to be used as training vectors
by running the modified algorithm concurrently with the cache spy process,
obtaining the needed cache-timing data. This provides large amounts of training
vectors and corresponding labels to define $T$ with minimal effort.

One might be tempted to use these vectors in their entirety for C. There are 
a few disadvantages in doing so: 

— This would cause VQ to run slower because $\#C$ would be sizable and contain
many vectors such that $L(c_i) = L(c_j)$ where $D(c_i, c_j)$ is needlessly small;
codebook redundancy in a sense. In practice we may need to analyze copious
volumes of trace data.

— We cannot assume the obtained cache-timing data templates are completely
error-free; we strive to curtail the effect of such erroneous vectors.

To circumvent these issues, we partition $T = \bigcup_{l \in \mathcal{L}} \{(t_i, l_i) : l_i = l\}$ as subsets
of all training vectors corresponding to a single label and subsequently perform
$k$-means clustering on the vectors in each subset. The resulting centroids are
then added to $C$. Finally, with $C$ and $T$ realized we employ LVQ to refine $C$.
This allows experimentation with different values for $k$ in $k$-means to arrive at
a suitably compact $C$ with small vector classification error rate.
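
The codebook construction above (partition by label, cluster each subset, collect the centroids) can be sketched as follows. The $k$-means routine is a bare-bones Lloyd iteration seeded with the first $k$ vectors of each subset, which is an assumption made to keep the example deterministic; a real run would use better seeding and then refine the result with LVQ.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd iteration, seeded with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            i = min(range(k), key=lambda c: math.dist(v, centroids[c]))
            clusters[i].append(v)
        for i, members in enumerate(clusters):
            if members:                       # keep old centroid if cluster is empty
                centroids[i] = [sum(xs) / len(members) for xs in zip(*members)]
    return centroids


def build_codebook(training, k):
    """Cluster the training vectors of each label separately; return (C, labels)."""
    by_label = {}
    for t, l in training:
        by_label.setdefault(l, []).append(t)
    codebook, labels = [], []
    for l, vectors in by_label.items():
        for c in kmeans(vectors, min(k, len(vectors))):
            codebook.append(c)
            labels.append(l)
    return codebook, labels


training = [([0.0, 0.0], "D"), ([0.0, 2.0], "D"),
            ([10.0, 10.0], "A"), ([10.0, 12.0], "A")]
codebook, labels = build_codebook(training, k=1)
print(codebook, labels)
```

Increasing `k` trades a larger codebook for a closer fit to each label's templates.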

While we expect quality results from VQ classification, errors will nevertheless 
occur. Furthermore, we are still left with the task of inferring algorithm state. 
To solve this problem, we turn to hidden Markov models. 

5 Hidden Markov Models 

HMMs (see, e.g., jTHj) are a common method for modeling discrete-time stochas- 
tic processes. An HMM is a statistical model in which the system being modeled 
is assumed to behave like a probabilistic finite state machine with directly unob- 
servable state. The only way of gaining information about the process is through 
the observations that are emitted from each state. 

HMMs have been successfully used in many real life applications; for example, 
many modern speech recognition methods are based on HMMs [HU. Their usability
is based on the ability to model physical systems and gain information about
the hidden causes of emitted observations. Thus, it is not very surprising that 
HMMs can be employed in side channel cryptanalysis as well: the target system 
can be viewed as the hidden part of the HMM and the emitted observations as 
information leaked through the side channel. In the following sections, we give 
a formal definition of an HMM, discuss the three basic problems for HMMs and 
describe how HMMs are used in our attack. The methodology should give an 
idea of how to use HMMs in side channel attacks in general. 

5.1 Elements of an HMM 

An HMM models a discrete-time stochastic process with a finite number of pos- 
sible states. The state of the process is assumed to be directly unobservable, but 
information about it can be gained from symbols that are emitted from each
state. The process changes its state based on a set of transition probabilities
that indicate the probability of moving from one state to another. An observable
symbol is emitted from each process state according to a set of emission
probabilities. An example of an HMM is illustrated in Fig. 2. This HMM models
a system with three internal states, which are denoted by circles in the figure.
Denoted by squares are the two symbols, which can be emitted from the internal
states. The state transition probabilities and the emission probabilities are
denoted by labeled arrows. For example, the probability of moving from state
$s_2$ to $s_3$ is $a_{23}$; the probability of emitting symbol $v_2$ from state $s_3$ is $b_3(2)$. In
this HMM, the process always starts from $s_1$. Generally, however, there may be
several possible first states. The initial state distribution defines the probability 
distribution for the first state over the states of the HMM. 



Fig. 2. An example of an HMM 


Formally, an HMM is defined by the set of internal states, the set of observation
symbols, the transition probabilities between internal states, the emission
probabilities for each observable, and the initial state distribution. We denote
the set of internal states by $S = \{s_1, s_2, \ldots, s_N\}$ and the state at time $t$ by $w_t$.
Correspondingly, the set of observables is denoted by $V = \{v_1, v_2, \ldots, v_M\}$ and
the observation emitted at time $t$ by $o_t$. The set of transition probabilities is
denoted by $A = \{a_{ij}\}$, where

$$a_{ij} = \Pr(w_{t+1} = s_j \mid w_t = s_i), \quad 1 \le i, j \le N,$$

such that $\sum_{j=1}^{N} a_{ij} = 1$ for all $1 \le i \le N$. Whenever $a_{ij} > 0$, there is a direct
transition from state $s_i$ to state $s_j$; otherwise, it is not possible to reach $s_j$ from $s_i$
in a single step. An arrow in Fig. 2 denotes a positive transition probability. Thus,
$s_3$ cannot be reached from $s_1$ in a single step. The set of emission probabilities
is denoted by $B = \{b_j(k)\}$, where

$$b_j(k) = \Pr(o_t = v_k \mid w_t = s_j), \quad 1 \le j \le N, \ 1 \le k \le M.$$

The initial state distribution indicates the probability distribution for the first
state $w_1$. It is denoted by $\pi = \{\pi_i\}$, where

$$\pi_i = \Pr(w_1 = s_i), \quad 1 \le i \le N.$$

The first state of the HMM in Fig. 2 is always $s_1$, so the initial state distribution
for this HMM is defined as $\pi_1 = 1$ and $\pi_i = 0$ for all $i \ne 1$. The three probability
measures $A$, $B$ and $\pi$ are called the model parameters. For convenience, we will
simply write $\lambda = (A, B, \pi)$ to indicate the complete parameter set of an HMM.

5.2 The Three Basic Problems for HMMs

The usefulness of HMMs is based on the ability to model relationships between 
internal states and observations. Related to this are the following three problems, 
which are commonly called the three basic problems for HMMs in literature (e.g.,
CHI):

Problem 1. Given an observation sequence $O = o_1 o_2 \cdots o_T$ and a model
$\lambda = (A, B, \pi)$, how do we efficiently compute $\Pr(O \mid \lambda)$, the probability of the
observation sequence given the model?

Problem 2. Given an observation sequence $O = o_1 o_2 \cdots o_T$ and a model $\lambda$,
what is the most likely state sequence $W = w_1 w_2 \cdots w_T$ that produced the
observations?

Problem 3. Given an observation sequence $O = o_1 o_2 \cdots o_T$ and a model $\lambda$, how
do we adjust the model parameters $\lambda = (A, B, \pi)$ to maximize $\Pr(O \mid \lambda)$?

We briefly review the methods used to solve these problems; the reader can refer
to [?] for a detailed overview. Problem 1 is sometimes called the evaluation
problem since it is concerned with finding the probability of a given sequence $O$.
This problem is solved by the forward-backward algorithm (see, e.g., [?]), which
efficiently computes the probability $\Pr(O|\lambda)$. Problem 2 is highly
relevant to our work. It is the problem of finding the most likely explanation
for the given observation sequence: the aim is to infer the most likely state
sequence $W$ that has produced the given observation sequence $O$. There are
other possible optimality criteria [?], but we are interested in finding the $W$
that maximizes $\Pr(W|O, \lambda)$. The problem is known as the decoding problem
and it is efficiently solved by the Viterbi algorithm [?]. Another relevant
question is posed by Problem 3, which asks how to adjust the model parameters
$\lambda = (A, B, \pi)$ to maximize the probability of the observation sequence
$O$. Although there is no known analytical method to adjust $\lambda$ such that
$\Pr(O|\lambda)$ is maximized, the Baum-Welch algorithm [?] provides one method
to locally maximize $\Pr(O|\lambda)$. The process is often called training the
HMM and it typically involves collecting a set of observation sequences from a
real physical phenomenon, which are used in training. This problem is known as
the learning problem.
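As an illustration of the evaluation problem (Problem 1), the following is a minimal sketch of the forward pass for a discrete HMM. The function name, the list-of-lists layout for $A$ and $B$, and the two-state toy model are assumptions made for this example; they are not taken from the paper.

```python
def forward_probability(A, B, pi, observations):
    """Return Pr(O | lambda) for a discrete HMM using the forward recursion.

    A[i][j]  : probability of moving from state i to state j
    B[i][o]  : probability that state i emits symbol o
    pi[i]    : probability of starting in state i
    """
    n_states = len(pi)
    # alpha[i] = Pr(o_1 .. o_t, state at time t is i | lambda)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state, two-symbol model used only to exercise the function.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
prob = forward_probability(A, B, pi, [0, 1, 1])
```

The recursion costs on the order of $N^2 t$ operations for $N$ states and $t$ observations, instead of summing over all $N^t$ state sequences.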

5.3 Use of HMMs in Side-Channel Attacks 

HMMs are also useful tools for side channel analysis [?]. Karlof and Wagner [?]
and Green et al. [?] use HMMs for modeling side channel attacks. Their research
is concerned with slightly different problems than ours. We outline the differences
below.

— They only consider Problem 1 and simulate the side channel. As a result, 
Problem 3 is not relevant to their work since the artificial side channel ac- 
tually defines the model that produces the observations. Thus their model 
parameters are known a priori. This is not the case for our work; Problem 3 
is essential. 

— They assume one state transition per key digit, in which case the key can 
be inferred directly from the operation of the algorithm. In our case, the 
operation sequence does not reveal the entire key, but a significant fraction 
of the key nevertheless. We use an HMM in which the states correspond only 
to possible algorithm states. 

— They are additionally interested in derivation of the (secret) scalar k in 
scalar multiplication when the same scalar is used during several runs using 
a process called belief propagation. This is not helpful in our case, since 
(EC)DSA uses nonces. 

A practical drawback of the HMM presented by Karlof and Wagner was that a
single observable needs to correspond to a single key digit (and internal state).
Green et al. presented a model where this is not required: multiple observables
can be emitted from each state. This is a more realistic model, as one system
state may emit variable-length data through the side channel. Our model also
allows this, but it is based on a different approach.

In the following sections, we describe the HMM used for modeling the 
OpenSSL scalar multiplication algorithm. We use this model in conjunction with 
VQ to describe the relationship between the states of the algorithm and the side
channel observations. We also describe how to perform side channel data anal- 
ysis using VQ and the HMM. The aim is to find the most likely state sequence 
for each trace that is obtained from the side channel. The analysis process can 
be divided into two steps: 

1. The VQ codebook is created and the HMM parameters are adjusted accord- 
ing to obtained training sequences. 

2. The actual data analysis is performed. When a sequence of observations 
is obtained from the side channel, we infer the most likely (hidden) state 
sequence that has emitted these observations using VQ and the HMM. 

Since these states correspond to the internal states of the system, we are able to 
determine a good estimate of what operations have been done. This information 
allows us to recover the key. 

The following sections give a framework for performing side channel attacks 
on any system. The main requirements are that we know the specification of 
the system and have access to do experiments with it or are able to accurately 
model it. 

The HMM for Scalar Multiplication. We construct an HMM where the
hidden part models the operation of the algorithm (in this case, scalar
multiplication using the modified NAF$_w$ representation), which leaks
information about the algorithm state through the side channel. An illustration
of this part (without the transition probabilities) is presented in Fig. 3. The
state set is defined as $S = \{s_1, \ldots, s_8\}$. Each label denotes the
operation that is performed in the corresponding state. In addition, there are
separate states to denote the system state preceding and following the execution
of the algorithm. These states are denoted by $s_1$ and $s_8$, respectively.
OpenSSL uses mNAF$_4$ for scalars in the case of the 160-bit curve order we are
experimenting with, so each point addition is followed by at least 4 point
doublings, except in the beginning or end of the process. The states
$s_3, \ldots, s_6$ represent these doublings. The most significant digit is
handled by the first addition state $s_2$.



Fig. 3. An HMM transition model for modified NAF4 scalar multiplication 


As can be seen from Fig. ?, the execution of one point doubling or point addition
spans several column vectors in the trace. Hence, we should let the internal
states emit multiple observations instead of just one. Green et al. [?] solved this
problem by introducing an additional variable that counts the cumulative number
of emitted observables. This has the drawback of considerably expanding
the state space. To avoid this, we solve the problem by introducing substates in
each HMM state. One main state consists of a sequence of substates, which are
just ordinary HMM states that always emit one observation. Thus, all previously
introduced techniques can be used for our HMM.
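The substate idea can be sketched as follows. This is only an illustration under simplifying assumptions: each main state is given a fixed, hypothetical number of substates, and the helper name `expand_substates` is invented for the example. The paper does not specify the substate counts or whether substates may repeat.

```python
def expand_substates(durations):
    """Expand main states into chains of ordinary single-emission substates.

    durations maps a main state label to its number of substates (assumed
    fixed here). Returns the flat list of substates and the set of allowed
    moves inside a main state: substate k may only advance to substate k + 1.
    """
    substates = [(s, k) for s, d in durations.items() for k in range(d)]
    chain = {
        ((s, k), (s, k + 1))
        for s, d in durations.items()
        for k in range(d - 1)
    }
    return substates, chain


# Hypothetical durations: an addition spans 3 trace vectors, a doubling 2.
durations = {"A": 3, "D": 2}
substates, chain = expand_substates(durations)
```

In this sketch the state count grows only by the sum of the durations, and every substate is a plain HMM state, so the standard algorithms apply unchanged. Transitions between main states would leave from the last substate of one chain and enter the first substate of another.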

The set of observables for this HMM is $V = \{D, A, E\}$, which is the same
set used for labeling cache-timing data vectors in Sect. ?. We assume that the
additions emit mainly As and the doublings mainly Ds. The $s_1$ and $s_8$ states
are assumed to emit mainly Es. These symbols are connected with side channel
observations using VQ as described in Sect. ?. Each vector observation is
labeled according to which state (A, D or E) it corresponds to. When a
new side channel observation is obtained, it can be classified as A, D or E by
taking the label of the closest codebook vector. An example of this is shown
in Fig. ?, where the rows directly above the observations represent the
quantized values. Symbols A and D are indicated using darker and lighter shades,
respectively.

Training of the HMM. Training starts by setting the initial model parameters.
These parameters can be rough estimates, since they will be improved
during training. To train the model, we obtain a set of sequences in the HMM
observation domain. These sequences can be created from the side channel
observations as we know how the algorithm operates. The obtained sequences are used
for model parameter re-estimation, which is performed using the Baum-Welch
algorithm [?]. Next, we create the codebook for VQ as shown in Sect. ?.
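The quantization step described above, classifying an observation by the label of its closest codebook vector, can be sketched as follows. Squared Euclidean distance, the two-dimensional vectors, and the codebook entries are assumptions for illustration; real cache-timing vectors have one component per measured cache set.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(trace, codebook):
    """Label each observation vector with the label of its nearest codebook vector.

    codebook is a list of (vector, label) pairs with label in {"A", "D", "E"}.
    """
    labels = []
    for vec in trace:
        _, label = min(codebook, key=lambda entry: squared_distance(vec, entry[0]))
        labels.append(label)
    return labels


# Hypothetical codebook and trace, chosen only to exercise the function.
codebook = [((10.0, 2.0), "A"), ((2.0, 10.0), "D"), ((1.0, 1.0), "E")]
trace = [(9.0, 3.0), (3.0, 9.0), (0.5, 1.5)]
symbols = quantize(trace, codebook)
```

The resulting symbol sequence is what the HMM consumes as its observation sequence.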

Inference of the State Sequence. Given a set of side channel observation
sequences from the real target system, we can infer the most likely hidden state
sequence for each of them. The first step is to perform VQ, that is, to tag the
observations with the label of the closest codebook vector. Thus, we get a set
of sequences in the HMM observation domain. By applying the Viterbi algorithm
[?], we finally obtain the most likely state sequence for each observation
sequence. These state sequences are actually sequences of substates; the actual
operation sequence can be recovered based on the transitions that are taken in
each state sequence. An example of this is shown in Fig. ?, where the upper rows
represent the main states of the algorithm. Additions are indicated using black;
doublings are indicated using lighter shades. For example, the first addition on
the top trace in Fig. ? is followed by five doublings.
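The decoding step and the recovery of the operation sequence from substates can be sketched as below. The four-substate toy model (two substates per addition, two per doubling), its probabilities, and the helper `collapse` are assumptions for illustration and do not reproduce the model of Fig. 3.

```python
import math


def viterbi(A, B, pi, observations):
    """Return the most likely state path of a discrete HMM (log-space Viterbi)."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n = len(pi)
    score = [log(pi[i]) + log(B[i][observations[0]]) for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for obs in observations[1:]:
        prev, step, score = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log(A[i][j]))
            step.append(best)
            score.append(prev[best] + log(A[best][j]) + log(B[j][obs]))
        back.append(step)
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    return path[::-1]


def collapse(path, first_substate):
    """Map a substate path to operations: entering a first substate starts one."""
    return [first_substate[s] for s in path if s in first_substate]


# Hypothetical model: substates 0,1 form an addition, substates 2,3 a doubling.
A = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0.5, 0, 0.5, 0]]
B = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]  # symbol 0 = "A", 1 = "D"
pi = [1.0, 0.0, 0.0, 0.0]
path = viterbi(A, B, pi, [0, 0, 1, 1, 1, 1, 0, 0])
operations = collapse(path, {0: "A", 2: "D"})
```

Working in log space avoids numerical underflow on long traces, and forbidden transitions simply carry a score of negative infinity.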

The state sequences obtained in this step can be used in conjunction with 
some other method to mount a key recovery attack. In the simplest case, the 
state sequence reveals the secret key directly and no other methods are needed. 
However, with mNAF$_4$ this is not the case; we discuss a few practical applications
in the next section, as well as give our empirical results. 

6 Results 

Depending on the attack scenario and the number of traces available, there are
at least two interesting ways to apply the analysis to the case of mNAF$_4$ and
OpenSSL. The first assumes access to only one trace or a similarly small number
of traces, while the second assumes access to a signature oracle and the
corresponding side channel information.

Solving Discrete Logs. We consider special versions of the baby-step giant-step
algorithm for searching restricted exponent spaces; see [?, Sect. 3.6] for a
good overview.

The length-$\ell$ mNAF$_w$ representation has maximum weight $\ell/w$ and average
weight $\ell/(w+1)$; we denote this weight as $t$. We assume that the analysis
provides us with the position of non-zero coefficients, but not their explicit
value or sign; thus each coefficient gives $w - 1$ bits of uncertainty. One can
then construct a baby-step giant-step algorithm to solve the ECDLP in this
restricted keyspace. The time and space complexity is $O(2^{(w-1)t/2})$; note
that this does not directly depend on $\ell$ (or further, the group order $n$).
For the curve under consideration, this gives a worst case of $O(2^{60})$ and on
average $O(2^{48})$, whereas the complexity without any such side channel
information is $O(2^{80})$.
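As a check of the figures quoted above, the exponents follow directly from the stated weights, taking $\ell = 160$ and $w = 4$ for the curve under consideration (the subscripts on $t$ are labels introduced here for the two cases):

```latex
\begin{align*}
t_{\max} &= \ell / w = 160 / 4 = 40,
  & (w-1)\,t_{\max}/2 &= 3 \cdot 40 / 2 = 60, \\
t_{\mathrm{avg}} &= \ell / (w+1) = 160 / 5 = 32,
  & (w-1)\,t_{\mathrm{avg}}/2 &= 3 \cdot 32 / 2 = 48, \\
\sqrt{n} &\approx 2^{160/2} = 2^{80}.
\end{align*}
```

The last line is the generic square-root cost of baby-step giant-step in a group of 160-bit order, which is the baseline the side channel information improves upon.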
Lattice Attacks. Despite this reduced complexity, an attacker cannot trivially
carry out the attack outlined above on a normal desktop PC. Known results on
attacking signature schemes with partial knowledge of nonces include [22,23]; the
approach is a lattice attack. Formally, the attacker obtains tuples
$(r_i, s_i, m_i, k_i)$ consisting of a signature, message, and partial knowledge
of the nonce $k$ obtained through the timing data analysis. For our experiments,
not all such tuples are useful in the lattice attack. Using the formalization of
[?], we assume $k_i$ tells us

$$k_i = z'_i + 2^{\alpha_i} z_i + 2^{\beta_i} z''_i$$

with $z_i$ the only unknown on the right. Our empirical timing data analysis results
show that the majority of errors occur when too many or too few doublings are
placed between additions; a synchronization error in a sense. So the farther we
move towards the MSB, the more likely it is that we have erroneous indexing on
$\beta_i$ and the lattice attack will likely fail.

To mitigate this issue, we instead focus only on the LSBs. We disregard the
upper term by setting $z''_i = 0$ and consider only tuples where $k_i$ indicates
that $z'_i = 0$ and $\alpha_i \geq 6$; that is, the LSBs of $k_i$ are 000000. For
$k$ chosen uniformly, this should happen with the reasonable probability of
$2^{-6}$. Our empirical results are in line with those of [22]: for a 160-bit
group order, 41 such tuples are usually enough for the lattice attack to succeed
in this case.

Lattice attacks have no recourse to compensate for errors. If our analysis
determines $z'_i = 0$ but in fact $z'_i \neq 0$ for some $i$, that instance of
the lattice attack will fail. We thus adopt the naive strategy of taking random
samples of size 41 from the set of tuples until the attack succeeds; an attacker
can always check the correctness of a guess by calculating the corresponding
public key and comparing it to the actual public key. This strategy is only
feasible if the ratio of error-free tuples to erroneous tuples is high.

Finally, we present the automated lattice attack results; 8K signatures with 
messages and traces were obtained in both cases. 

Pentium 4 results. The analysis yielded 122 tuples indicating $z'_i = 0$
(107 correct, 15 incorrect). The long-term private key $d$ was recovered after
1007 lattice attack iterations. The analysis ran in less than an hour on a Core
2 Duo.

Atom results. The analysis yielded 147 tuples indicating $z'_i = 0$
(115 correct, 32 incorrect). We recovered $d$ after a total of 37196 lattice
attack iterations. Our analysis is less accurate in this case, but still accurate
enough to recover the key in only a few hours on a Core 2 Duo.
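The naive resampling strategy can be given a rough back-of-the-envelope model. The sketch below assumes samples of 41 tuples are drawn uniformly without replacement, that an attack iteration succeeds exactly when the sample contains no erroneous tuple, and that "8K" means 8000 signatures. These are simplifying assumptions, not statements from the paper.

```python
from math import comb


def success_probability(correct, incorrect, sample_size=41):
    """Probability that a uniformly random sample contains no erroneous tuple."""
    total = correct + incorrect
    return comb(correct, sample_size) / comb(total, sample_size)


def expected_iterations(correct, incorrect, sample_size=41):
    """Mean number of draws until one error-free sample (geometric distribution)."""
    return 1 / success_probability(correct, incorrect, sample_size)


# Tuple counts reported in the text.
pentium4 = expected_iterations(107, 15)
atom = expected_iterations(115, 32)

# Expected number of usable tuples if 6 zero LSBs occur with probability 2^-6.
expected_tuples = 8000 * 2 ** -6  # "8K" read as 8000: an assumption
```

Under this model the expected iteration count for the Pentium 4 figures is of the same order as the 1007 iterations reported, and the expected number of usable tuples is close to the 122 and 147 observed. The reported iteration counts are single runs, so they need not match the mean closely.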

Summary. We omit strategies for finding correlation between the traces and
specific key digits. This can be tremendously helpful in further reducing the
search space when trying to solve the ECDLP. As such, given only one or a few
traces, this analysis method should be used as a tool in conjunction with other
heuristics to trim the search space. The lattice attack given here is a
proof of concept. The results suggest that significantly fewer signatures are
needed. In practice one can perform a much more intelligent lattice attack,
perhaps even considering lattice attacks that account for key digit reuse [?].

7 Countermeasures

An implementation should not rely on any one countermeasure for side channel 
security, but rather a combination. We briefly discuss countermeasures, with an 
emphasis on preventing the specific weakness we exploited in OpenSSL. 

Scalar Blinding. One often-proposed strategy [?] is to blind the scalar
$k$ from the point multiplication routine using randomization. One form is
$(k + mn + \bar{m})P - \bar{m}P$ with $m$, $\bar{m}$ small (e.g. 32-bit) and
random. The calculation is then carried out using multi-exponentiation with
interleaving. With such a strategy, it suffices that $m$ is low weight, not
necessarily short.

Randomized Algorithms. Use random addition-subtraction chains instead of
highly regular double-and-add routines. Oswald [?] gave an example and
a subsequent attack [?]. Published algorithms tend to be geared towards
hardware or resource-restricted devices; see [?] for a good review. In a
software package like OpenSSL that normally runs on systems with abundant
memory, one does not have to rely on simple randomized recoding and can
build more flexible addition-subtraction chains.

Shared Context. In OpenSSL’s ECC implementation, the results and illustration
in Fig. ? suggest that what is most visible in the traces is not the lookup from
the precomputation table, but the dynamic memory for variables in the point
addition and doubling functions. OpenSSL is equipped with a shared context
[30, pp. 106-107] responsible for allocating memory for curve and finite
field arithmetic. Memory from this context should be served up randomly to
prevent a clear fixed memory access pattern.

Operation Balancing. In addition to the above shared context, coordinate
systems and point addition formulae that are balanced in the number and
order of operations are also useful; [?] gives an example.

The above countermeasures are restricted to the software engineering view.
Clearly, operating system-level and hardware-level countermeasures are
additionally possible. We leave general countermeasures to this type of attack
as an open question.
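The algebra behind the scalar blinding countermeasure can be checked with a toy stand-in for the curve group. In the sketch below, "points" are integers modulo a prime `N` and "point multiplication" is ordinary modular multiplication. The group, the constants, and the function names are assumptions chosen so that the example is self-contained, and the two multiplications are done separately rather than with interleaved multi-exponentiation.

```python
import secrets

N = 2 ** 127 - 1  # stand-in prime group order (hypothetical, not the paper's curve)
P = 5             # generator of the additive group Z_N, used as a toy "point"


def scalar_mul(k, point):
    """Toy point multiplication in Z_N; a real one would use curve arithmetic."""
    return (k * point) % N


def blinded_scalar_mul(k, point):
    """Compute kP as (k + m*N + m_bar)P - m_bar*P with fresh random m, m_bar."""
    m = secrets.randbits(32)
    m_bar = secrets.randbits(32)
    blinded = k + m * N + m_bar  # randomized representation of the scalar
    return (scalar_mul(blinded, point) - scalar_mul(m_bar, point)) % N


k = 123456789
result = blinded_scalar_mul(k, P)
```

Because the group order annihilates every element, the term involving `m * N` vanishes and the `m_bar` contributions cancel, so the result equals $kP$ while the digit sequence processed by the multiplication routine changes on every call.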

8 Conclusion 

We summarize our contributions as follows: 

- We introduced a method for automated cache-timing data analysis, facilitating
discovery of critical algorithm state. This is the first work we are aware of
that provides this at a framework level, i.e. not specific to one cryptosystem.
Consequently, it bridges the gap between cache attack vulnerabilities [?]
and attacks requiring partial key material [22,23].

- We showed how to apply HMM cryptanalysis to cache-timing data; to the
best of our knowledge, this is its first published application to real traces.
This builds on existing work in the area of abstract side channel analysis using
HMMs [4,5,6], yet departs from it by tackling practical issues inherent to
concrete side channels.

- We demonstrated that the method is indeed practical by carrying out an attack
on the elliptic curve portion of OpenSSL using live cache-timing data. The
attack resulted in complete key recovery, with the analysis running in a
matter of hours on a normal desktop PC.

The method works by: 

1. Creating cache-timing data vector templates that reflect the algorithm’s 
cache access behavior. 

2. Using VQ to match incoming cache-timing data to these existing templates. 

3. Using the output as observation input to an HMM that accurately models 
the control flow of the algorithm. 

The setup phase, including acquiring the templates used to build the VQ codebook
vectors and learning the HMM parameters, is the only part that by definition
requires any manual work, and the majority of that can in fact be automated
by simple modifications to the software under attack. This attack scenario is
described for hardware power analysis in [?], but is perhaps an even greater
practical threat in this case due to the inherent malleability of software.
After the setup phase, cache-timing data analysis is fully automated and
requires negligible time.

The analysis given here is meant not only for attacking implementations,
but for defending them as well. We encourage software developers to analyze
their implementations using these methods to discover memory access patterns
and apply appropriate countermeasures.

Future Work 

One might think to forgo the VQ step and use the cache-timing data directly as
the sole input to the HMM. In our experience, this only complicates the model
and degrades the quality of the results.

The example we gave was tailored to data caches, in particular the L1 data
cache. Other data caches could prove equally fruitful. We also plan to apply the
analysis method to instruction caches.

While the attack results we gave were for one particular cryptosystem im- 
plementation, the analysis method has a much wider range of applications. We 
in fact found a similar vulnerability in the NSS library’s implementation of el- 
liptic curves. Departing from elliptic curves and public key cryptography, we 
plan to apply the analysis to an assortment of implementations, asymmetric and 
symmetric primitives alike. 

One of the more interesting planned applications is to algorithms with good 
side channel resistance properties, such as “Montgomery’s ladder”. While this 
might be an overwhelming challenge for traditional power analysis, the work 
here emphasizes the fact that cache-timing attacks are about memory access 
patterns; a fixed sequence of binary operations cannot be assumed sufficient to 
thwart cache-timing attacks. 

Acknowledgments. We thank the following people for comments and discussions:
Dan Bernstein, Kimmo Jarvinen, Kaisa Nyberg, and Dan Page.

Abstract. Physical attacks on cryptographic implementations and devices
have become crucial. In this context a recent line of research on a
new class of side-channel attacks, called memory attacks, has received
increasing attention. These attacks allow an adversary to measure a
significant fraction of secret key bits directly from memory, independent
of any computational side-channels.

Physically Unclonable Functions (PUFs) represent a promising new
technology that allows secrets to be stored in a tamper-evident and unclonable
manner. PUFs derive their security from physical structures at the submicron
level and are very useful primitives to protect against memory
attacks.

In this paper we aim at making the first step towards combining and
binding algorithmic properties of cryptographic schemes with the physical
structure of the underlying hardware by means of PUFs. We introduce a
new cryptographic primitive based on PUFs, which we call PUF-PRFs.
These primitives can be used as a source of randomness like pseudorandom
functions (PRFs). We construct a block cipher based on PUF-PRFs
that allows simultaneous protection against algorithmic and physical
attackers, in particular against memory attacks. While PUF-PRFs in general
differ in some aspects from traditional PRFs, we show a concrete
instantiation based on established SRAM technology that closes these
gaps.


1 Introduction 

Modern cryptography provides a variety of tools and methodologies to analyze
and to prove the security of cryptographic schemes such as in [?]. These
proofs always start from a particular setting with a well-defined adversary model
and security notion. The vast majority of these proofs assume a black box model:
the attacker knows all details of the used algorithms and protocols but has no
knowledge of or access to the secrets of the participants, nor can he observe
any internal computations. The idealized model allows one to derive security
guarantees and gain valuable insights.

© International Association for Cryptologic Research 2009


F. Armknecht et al. 


However, as soon as this basic assumption fails most security guarantees are
off and a new open field of study arises. In cryptographic implementations
long-term secret keys are typically stored by configuring a non-volatile memory
such as ROM, EEPROM, flash, anti-fuses, poly or e-fuses into a particular state.
Computations on these secrets are performed by driving electrical signals from
one register to the next and transforming them using combinatorial circuits
consisting of digital gates. Side-channel attacks pick up physically leaked
key-dependent information from internal computations, e.g. by observing consumed
power [?] or emitted radiation [?], making many straightforward algorithms and
implementations insecure. It is clear that from an electronic hardware point of
view, security is viewed differently, see e.g. [?].

Even when no computation is performed, stored secret bits may leak. For
instance, in [?] it was shown that data can be recovered from flash memory
even after a number of erasures. By decapsulating the chip and using scanning
electron microscopes or transmission electron microscopes the states of anti-fuses
and flash can be made visible. Similarly, a typical computer memory is not erased
when its power is turned off, giving rise to so-called cold-boot attacks [?]. More
radical approaches such as opening up an integrated circuit and probing metal
wires or scanning non-volatile memories with advanced microscopes or lasers
generally lead to a security breach of an algorithm, often immediately revealing
an internally stored secret [?].

Given this observation, it becomes natural to investigate security models with
the basic assumption: memory leaks information on the secret key. Consequently,
a recent line of work has investigated the use of new cryptographic
primitives that are less vulnerable to leakage of key bits [?]. These works
establish security by adapting public-key algorithms to remain secure even after
leaking a limited number of key bits. However, no security guarantees can be
given when the leakage exceeds a certain threshold, e.g. when the whole
non-volatile memory is compromised. Furthermore, they do not provide a solution
for the traditional settings, e.g. for securing symmetric encryption schemes.

Here we explore an alternative approach: Instead of making another attempt
to solve the problem in an algorithmic manner, we base our solution on a new
physical primitive. So-called Physically Unclonable Functions (PUFs) provide a
new cryptographic primitive able to store secrets in a non-volatile but highly
secure manner. When embedded into an integrated circuit, PUFs are able to use
the deep submicron physical uniqueness of the device as a source of randomness
[?]. Since this randomness stems from the uncontrollable subtleties of
the manufacturing process, rather than from hard-wired bits, it is practically
infeasible to externally measure these values during a physical attack. Moreover,
any attempt to open up the PUF in order to observe its internals will with
overwhelming probability alter these variables and change or even destroy the
PUF [?].

In this paper, we take advantage of the useful properties of PUFs to build
an encryption scheme resilient against memory leakage adversaries as defined in
[?]. We construct a block cipher that explicitly makes use of the algorithmic and
physical properties of PUFs to protect against physical and algorithmic attacks
at the same time. Other protection mechanisms against physical attacks require
either additional algorithmic effort on the schemes, e.g. [?], or separate
(possibly expensive) hardware measures.

Our encryption scheme is particularly suited to applications such as secure
storage of data on untrusted storage (e.g., a hard disk) where (i) no storage
of secrets for encryption/decryption is needed and keys are only re-generated
when needed, (ii) copying the token is infeasible (unclonability), (iii) temporary
unauthorized access to the token will reveal data to the adversary but not the
key, or (iv) no non-volatile memory is available.

Contribution. Our contributions are as follows: 

A new cryptographic primitive: PUF-PRF. We place the PUFs at the core of a
pseudorandom function (PRF) construction that meets well-defined properties.
We provide a formal model for this new primitive that we refer to as PUF-PRFs.
PRFs [?] are fundamental primitives in cryptography and have many
applications, e.g. see [?].

A PUF-PRF-based provably secure block cipher. One problem with our PUF- 
PRF construction is that it requires some additional helper data that inevitably 
leaks some internal information. Hence, PUF-PRFs cannot serve as a direct 
replacement for PRFs. However, we present a provably secure block cipher based 
on PUF-PRFs that remains secure despite the information leakage. Furthermore, 
no secret key needs to be stored, protecting the scheme against memory leakage 
attacks. The tight integration of PUF-PRFs into the cryptographic construction 
improves the tamper-resilience of the overall design. Any attempt at accessing 
the internals of the device will result in a change of the PUF-PRF. Hence, no 
costly error detection networks or alternative anti-tampering technologies are 
needed. The unclonability and tamper-resilience properties of the underlying 
PUFs allow for elegant and cost-effective solutions to specific applications such 
as software protection or device encryption. 

An improved and practical PUF-PRF construction. Although the information
leakage through helper data is unavoidable in the general case, the concrete case
might allow for more efficient and secure constructions. We introduce SRAM-PRFs,
based on so-called SRAM PUFs, which are similar to the general PUF-PRFs
but where it can be shown that no information is leaked through the
helper data if run in an appropriate mode of operation. Hence, SRAM-PRFs are
for all practical purposes a physical realization of expanding PRFs.

Organization. This paper is organized as follows. First, we give an overview
of related work in Section 2. In Section 3, we define and justify the considered
attacker model. In Section 4, we introduce a formal model for PUFs. Based on
this, we define in Section ? a new cryptographic primitive, termed PUF-PRFs.
Furthermore, we present a provably secure block cipher based on PUF-PRFs
that is secure despite the information leakage through helper data. In Section ?,
we explain for the concrete case of SRAM PUFs an improved construction that
shares the same benefits as general PUF-PRFs but where it can be argued that
the helper data does not leak any information. Finally, in Section ?, we present
the conclusions.

2 Related Work 

In recent years numerous results in the field of physical attacks emerged showing
that the classical black box model is overly optimistic, see e.g. [?].
Due to a number of physical leakage channels, the adversary often learns (part of)
a stored secret or is able to observe some intermediate results of the private
computations. These observations give him a powerful advantage that often breaks
the security of the entire scheme. To cope with this reality, a number of new
theoretic adversary models were proposed, incorporating possible physical
leakage of this kind. Ishai et al. [?] model an adversary which is able to probe,
i.e. eavesdrop on, a number of lines carrying intermediate results in a private
circuit, and show how to create a secure primitive within this computational
leakage model. Later, generalizations such as Physically Observable Cryptography
proposed by Micali et al. [?] investigate the situation where only computation
leaks information while assuming leak-proof secret storages. Recently, Pietrzak
[?] and Standaert et al. put forward some new models and constructions taking
physical side-channel leakage into account.

Complementary to the computation leakage attacks, another line of work
explored memory leakage attacks: an adversary learns a fraction of a stored secret
[?]. In [2], Akavia et al. introduced a more realistic model that considers the
security against a wide class of side-channel attacks when a function of the secret
key bits is leaked. Akavia et al. further showed that Regev’s lattice-based scheme
[?] is resilient to key leakage. More recently, Naor et al. [?] proposed a generic
construction for a public-key encryption scheme that is resilient to key leakage.
Although all these papers present strong results from a theoretical security point
of view, they are often much too expensive to implement on an integrated circuit
(IC), e.g. the size of private circuits in [?] blows up with $O(n^2)$ where $n$
denotes the number of probings by the adversary. Moreover, almost all of these
proposals make use of public-key crypto primitives, which introduce a significant
overhead in systems where symmetric encryption is desired for improved efficiency.

Besides the information leakage attacks mentioned above, another important
field of study is tampering attacks. Numerous countermeasures have been
discussed, e.g., use of a protective coating layer [?] or the application of error
detection codes (EDCs) [?]. Observe that limitations and benefits of
tamper-proof hardware have likewise been theoretically investigated in a series
of works [?].


3 Memory Attacks 

In this work we consider an extension of memory attacks as introduced by Akavia
et al. [2] where the attacker can extract a bounded number of bits of a stored
secret. The model allows for covering a large variety of different memory attacks,
e.g., cold boot attacks described in [22]. However, this general model might not
adequately capture certain concrete scenarios. For example, feature sizes on ICs
have shrunk to nanometer levels and probing such fine metal wires is a difficult
task even for high-end IC manufacturers. During a cryptographic computation
a secret state is (temporarily) stored in volatile memory (e.g. in registers and
flip-flops). In a typical IC, these structures are relatively small compared to the
rest of the circuit, making them very hard to locate and scan properly. Thus,
applying these attacks is usually physically much more involved for the
case of embedded ICs than for the non-embedded PC setting where additional
measures to access the memory exist, e.g., through software and networks.

On the other hand, storing long-term secrets, such as private keys, requires
non-volatile memory, i.e. memory that sustains its state while the embedding
device is powered off. Implementation details of such memories like ROM, EEPROM,
flash, anti-fuses, poly or e-fuses and recent results on physical attacks
such as [?] indicate that physically attacking non-volatile memory is much easier
than attacking register files or probing internal buses on recent ICs, making
non-volatile memory effectively the weak link in many security implementations.

Motivated by these observations, we consider the following attacker model in 
this work: 

Definition 1 (Non-volatile Memory Attacker). Let $\alpha : \mathbb{N} \to \mathbb{N}$
be a function with $\alpha(n) \leq n$ for all $n \in \mathbb{N}$, and let $S$ be a
secret stored in non-volatile memory. An $\alpha$-non-volatile memory attacker
$\mathcal{A}$ can access an oracle $\mathcal{O}$ that takes as input adaptively
chosen polynomial-size circuits $h_i$ and outputs $h_i(S)$ under the condition
that the total number of bits that $\mathcal{A}$ gets as a result of oracle
queries is at most $\alpha(|S|)$.

The attacker is called a full non-volatile memory attacker if
$\alpha = \mathrm{id}$, that is, if the attacker can extract the whole content
of the non-volatile memory.
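
The oracle in Definition 1 can be illustrated with a short sketch. The class and
parameter names below (`NonVolatileMemoryOracle`, `alpha`, `out_bits`) are
hypothetical and introduced only for illustration, and the leakage functions are
modelled as plain Python callables that return an integer rather than as
polynomial-size circuits.

```python
class LeakageBudgetExceeded(Exception):
    """Raised when a query would exceed the total leakage bound alpha(|S|)."""


class NonVolatileMemoryOracle:
    """Toy model of the leakage oracle: answers h(S) until the bit budget is used up."""

    def __init__(self, secret: bytes, alpha):
        self._secret = secret
        # alpha maps the secret length in bits to the number of bits that may leak.
        self._budget = alpha(8 * len(secret))
        self.leaked_bits = 0

    def query(self, h, out_bits: int) -> int:
        """Return the low out_bits bits of h(S), if the remaining budget allows it."""
        if out_bits < 0 or self.leaked_bits + out_bits > self._budget:
            raise LeakageBudgetExceeded
        self.leaked_bits += out_bits
        return h(self._secret) & ((1 << out_bits) - 1)
```

With `alpha = lambda n: n // 2` and a 2-byte secret, one 8-bit query exhausts the
budget and any further query is refused; `alpha = lambda n: n` corresponds to the
full non-volatile memory attacker.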

Obviously, protection against full non-volatile memory attackers is only
possible if no long-term secrets are stored within non-volatile memory. One
obvious approach is to require a user password before each invocation. However,
this reduces usability and is likely to be vulnerable to password attacks. In
this paper, we use another approach and make use of a physical primitive called
a Physically Unclonable Function (PUF). PUFs make it possible to intrinsically
store permanent secrets which are, according to the current state of knowledge,
not accessible to a non-volatile memory attacker.

4 Physically Unclonable Functions 

In this section, we introduce a formal model for Physically Unclonable Functions
(PUFs). We start with some basic definitions. For a probability distribution
$\mathbb{D}$, the expression $x \leftarrow \mathbb{D}$ denotes the event that $x$
has been sampled according to $\mathbb{D}$. For a set $S$, $x \leftarrow S$ means
that $x$ has been sampled uniformly at random from $S$. For $m \geq 1$, we denote
by $U_m$ the uniform distribution on $\{0,1\}^m$. The min-entropy
$H_\infty(\mathbb{D})$ of a distribution $\mathbb{D}$ is defined by
$H_\infty(\mathbb{D}) \overset{\mathrm{def}}{=} -\log_2(\max_x \Pr[x \leftarrow \mathbb{D}])$.
Min-entropy can be viewed as the “worst-case” entropy in a random variable
sampled according to $\mathbb{D}$, and specifies how many nearly uniform random
bits can be extracted from it.
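
As a small worked example of the min-entropy definition, the following sketch
computes it for a finite distribution given as a list of probabilities. The
function name `min_entropy` is an illustrative choice, not part of the paper.

```python
import math


def min_entropy(probabilities):
    """Min-entropy in bits: -log2 of the probability of the most likely outcome."""
    if not math.isclose(sum(probabilities), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -math.log2(max(probabilities))
```

For the uniform distribution on four outcomes this gives 2 bits, while a
distribution whose most likely outcome has probability 1/2 has only 1 bit of
min-entropy, regardless of how the remaining mass is spread.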

A distinguisher $\mathcal{D}$ is a (possibly probabilistic) algorithm that aims
to distinguish between two different distributions $\mathbb{D}$ and
$\mathbb{D}'$. More precisely, $\mathcal{D}$ receives some values (which may
depend on inputs adaptively chosen by $\mathcal{D}$) and outputs a value from
$\{0,1\}$. The advantage of $\mathcal{D}$ is defined by
$\mathrm{Adv}(\mathcal{D}) = |\Pr[1 \leftarrow \mathcal{D} \mid \mathbb{D}] - \Pr[1 \leftarrow \mathcal{D} \mid \mathbb{D}']|$.
Furthermore, we define the advantage of distinguishing between $\mathbb{D}$ and
$\mathbb{D}'$ as $\max_{\mathcal{D}} \mathrm{Adv}(\mathcal{D})$.

In a nutshell, PUFs are physical mechanisms that accept challenges and return
responses, that is, they behave like functions. The main properties of PUFs
that are important in the context of cryptographic applications are noise (the
same challenge can lead to different, but close, responses), non-uniform
distribution (the distribution of the responses is usually non-uniform),
independence (two different PUFs show completely independent behavior),
unclonability (no efficient process is known that allows for physically cloning
PUFs), and tamper evidence (physically tampering with a PUF will most likely
destroy its physical structure, making it unusable, or turn it into a new PUF).
We want to emphasize that the properties above are of a physical nature and
hence are very hard to prove in the rigorous mathematical sense. However, they
are based on experiments conducted worldwide and reflect the current assumptions
and observations regarding PUFs; see, e.g., [?]. We first provide a formal
definition for noisy functions before we give a definition for PUFs.

Definition 2 (Noisy functions). For three positive integers
$\ell, m, \delta \in \mathbb{N}$ with $0 \leq \delta \leq m$, a
$(\ell, m, \delta)$-noisy function $f^*$ is a probabilistic algorithm which
accepts inputs (challenges) $x \in \{0,1\}^\ell$ and generates outputs
(responses) $y \in \{0,1\}^m$ such that the Hamming distance between two outputs
to the same input is at most $\delta$. In a similar manner, we define a
$(\ell, m, \delta)$-noisy family of functions to be a set of
$(\ell, m, \delta)$-noisy functions.
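
A toy simulation may help to make Definition 2 concrete. The sketch below is an
assumption-laden model, not a PUF: each challenge gets a fixed random reference
response, and every evaluation flips at most $\delta/2$ bits of it, so that any
two responses to the same challenge are within Hamming distance $\delta$. The
names `NoisyFunction` and `hamming` are hypothetical.

```python
import random


class NoisyFunction:
    """Toy (l, m, delta)-noisy function over integer challenges 0 .. 2**l - 1."""

    def __init__(self, l: int, m: int, delta: int, seed: int = 0):
        self.l, self.m, self.delta = l, m, delta
        self._rng = random.Random(seed)
        self._reference = {}  # challenge -> fixed reference response

    def __call__(self, x: int):
        if not 0 <= x < 2 ** self.l:
            raise ValueError("challenge out of range")
        if x not in self._reference:
            self._reference[x] = [self._rng.randrange(2) for _ in range(self.m)]
        y = list(self._reference[x])
        # Flip at most delta // 2 positions, so two responses differ in <= delta bits.
        flips = self._rng.randint(0, self.delta // 2)
        for pos in self._rng.sample(range(self.m), flips):
            y[pos] ^= 1
        return y


def hamming(a, b) -> int:
    """Hamming distance between two equal-length bit lists."""
    return sum(u != v for u, v in zip(a, b))
```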

Definition 3 (Physically Unclonable Functions). A
$(\ell, m, \delta; q_{puf}, \epsilon_{puf})$-family of PUFs $\mathcal{P}$ is a set
of physical realizations of a family of probabilistic algorithms that fulfills
the following algorithmic and physical properties.

Algorithmic properties 

- Noise: $\mathcal{P}$ is a $(\ell, m, \delta)$-noisy family of functions with
  $\delta < \frac{m}{2}$.

- Non-uniform output and independence: There exists a distribution $\mathbb{D}$
  on $\{0,1\}^m$ such that for any input $x \in \{0,1\}^\ell$, the following two
  distributions on $(\{0,1\}^m)^{q_{puf}}$ can be distinguished with advantage at
  most $\epsilon_{puf}$:

  1. $(\Pi_1(x), \ldots, \Pi_{q_{puf}}(x))$ for adaptively chosen
     $\Pi_i \in \mathcal{P}$.

  2. $(y_1, \ldots, y_{q_{puf}})$ with $y_i \leftarrow \mathbb{D}$.

In order to have a practically useful PUF, it should be that
$q_{puf} \approx |\mathcal{P}|$, $\epsilon_{puf}$ is negligible, and
$H_\infty(\mathbb{D}) > 0$.

Physical properties

- Unclonability: No efficient technique is known to physically clone any
  member $\Pi \in \mathcal{P}$.

- Tamper evidence: For any PUF $\Pi \in \mathcal{P}$, any attempt to externally
  obtain its responses or parameters, e.g., by means of a physical attack, will
  significantly alter its functionality or destroy it.

A number of constructions for PUFs have been implemented, and most of them
have been experimentally verified to meet the properties of this theoretical
definition. For more details we refer to the literature, e.g., [?]. One
important observation we make is that a number of PUF implementations can be
efficiently implemented on an integrated circuit, e.g., SRAM PUFs [21]. Their
challenge-response behavior can hence be easily integrated with a chip’s digital
functionality.

Remark 1. Due to their physical properties, PUFs have become an interesting
building block for protecting against full non-volatile memory attackers. The
basic idea is to use a PUF for implicitly storing a secret: instead of putting a
secret directly into non-volatile memory, it is derived from the PUF responses
during run time [20,21].


5 Encrypting with PUFs: A Theoretical Construction 

In the previous section, we explained how to use PUFs for protecting any
cryptographic scheme against full non-volatile memory attackers (see Remark 1).
In the remainder of the paper, we go one step further and explore how to use
PUFs for additionally protecting against algorithmic attackers. For this
purpose, we discuss how to use PUFs as a source of reproducible
pseudorandomness. This approach is motivated by the observation that certain
PUFs behave to some extent like unpredictable functions. This will allow for
constructing (somewhat weaker) physical instantiations of (weak) pseudorandom
functions.

5.1 PUF-(w)PRFs 

Pseudorandom functions (PRFs) [?] are important cryptographic primitives
with various applications (see, e.g., [?]). We recall their definition.

Definition 4 ((Weak) Pseudorandom Functions). Consider a family of functions
$\mathcal{F}$ with input domain $\{0,1\}^\ell$ and output domain $\{0,1\}^m$. We
say that $\mathcal{F}$ is $(q_{prf}, \epsilon_{prf})$-pseudorandom with respect
to a distribution $\mathbb{D}$ on $\{0,1\}^m$ if the advantage to distinguish
between the following two distributions for adaptively chosen pairwise distinct
inputs $x_1, \ldots, x_{q_{prf}}$ is at most $\epsilon_{prf}$:

- $y_i = f(x_i)$ where $f \leftarrow \mathcal{F}$
- $y_i \leftarrow \mathbb{D}$

$\mathcal{F}$ is called weakly pseudorandom if the inputs are not chosen by the
distinguisher but sampled uniformly at random from $\{0,1\}^\ell$ (still under
the condition of being pairwise distinct).

$\mathcal{F}$ is called $(q_{prf}, \epsilon_{prf})$-(weakly)-pseudorandom if it
is $(q_{prf}, \epsilon_{prf})$-(weakly)-pseudorandom with respect to the uniform
distribution $\mathbb{D} = U_m$.

Remark 2. This definition differs slightly in several aspects from the original
definition of pseudorandom functions, e.g., [?]. First, specifying the output
distribution $\mathbb{D}$ allows for covering families of functions which have a
non-uniform output distribution, e.g., PUFs. The original case, as stated in the
definition, is $\mathbb{D} = U_m$.

Second, the requirement of pairwise distinct inputs $x_i$ has been introduced to
deal with noisy functions where the same input can lead to different outputs. By 
disallowing multiple queries on the same input, we do not need to model the noise 
distribution, which is sometimes hard to characterize in practice. Furthermore, 
in the case of non-noisy (weak) pseudorandom functions, an attacker gains no 
advantage by querying the same input more than once. Hence, the requirement 
does not limit the attacker in the non-noisy case. 

Observe that the “non-uniform output and independence” assumption on PUFs
(as defined in Definition 3) does not automatically imply (weak)
pseudorandomness. The former considers the unpredictability of the response to a
specific challenge after making queries to several different PUFs, while the
latter considers the unpredictability of the response to a challenge after
making queries to the same PUF.

Obviously, the main obstacle is to convert noisy non-uniform inputs into
reliably reproducible, uniformly distributed random strings. For this purpose,
we make use of an established tool in cryptography, namely fuzzy extractors
(FE) [12]:

Definition 5 (Fuzzy Extractor). An $(m, n, \delta; \mu_{FE}, \epsilon_{FE})$-fuzzy
extractor $E$ is a pair of randomized procedures, “generate”
$\mathrm{Gen} : \{0,1\}^m \to \{0,1\}^n \times \{0,1\}^*$ and “reproduce”
$\mathrm{Rep} : \{0,1\}^m \times \{0,1\}^* \to \{0,1\}^n$.

The correctness property guarantees that for $(z, \omega) \leftarrow \mathrm{Gen}(y)$
and $y' \in \{0,1\}^m$ with $\mathrm{dist}(y, y') \leq \delta$, it holds that
$\mathrm{Rep}(y', \omega) = z$. If $\mathrm{dist}(y, y') > \delta$, then no
guarantee is provided about the output of Rep.

The security property guarantees that for any distribution $\mathbb{D}$ on
$\{0,1\}^m$ of min-entropy $\mu_{FE}$, the string $z$ is nearly uniform even for
those who observe $\omega$: if $(z, \omega) \leftarrow \mathrm{Gen}(\mathbb{D})$,
then it holds that $\mathrm{SD}((z, \omega), (U_n, \omega)) \leq \epsilon_{FE}$.

PUFs are most commonly used in combination with fuzzy extractor constructions
based on error-correcting codes and universal hash functions. In this case,
the helper data consists of a code-offset, which is of the same length as the
PUF output, and the seed for the hash function, which is on the order of 100
bits and can often be reused for all outputs.
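
The code-offset idea can be sketched in a few lines. This is a minimal
illustration under strong simplifying assumptions: a repetition code stands in
for the error-correcting code, and seeded SHA-256 stands in for the universal
hash function (it is not one). The names `gen`, `rep`, and `REP` are
illustrative and do not come from the text above.

```python
import hashlib
import secrets

REP = 5  # repetition factor: corrects up to 2 bit errors per block of 5 bits


def _encode(message_bits):
    """Repetition code: repeat every message bit REP times."""
    return [b for b in message_bits for _ in range(REP)]


def _decode(code_bits):
    """Majority decoding of each block of REP bits."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def _hash(seed: bytes, bits) -> bytes:
    """Stand-in for the universal hash: seeded SHA-256 over the bit list."""
    return hashlib.sha256(seed + bytes(bits)).digest()


def gen(y):
    """Generate (z, helper) from a response y, a list of 0/1 bits."""
    if len(y) % REP != 0:
        raise ValueError("response length must be a multiple of REP")
    codeword = _encode([secrets.randbelow(2) for _ in range(len(y) // REP)])
    offset = [a ^ b for a, b in zip(y, codeword)]  # code-offset helper data
    seed = secrets.token_bytes(16)                 # public hash seed
    return _hash(seed, y), (offset, seed)


def rep(y_noisy, helper):
    """Reproduce z from a noisy response and the helper data."""
    offset, seed = helper
    shifted = [a ^ b for a, b in zip(y_noisy, offset)]  # codeword plus noise
    codeword = _encode(_decode(shifted))                # error-corrected codeword
    y = [a ^ b for a, b in zip(codeword, offset)]       # recovered response
    return _hash(seed, y)
```

As long as the noisy response differs from the enrolled one in at most two
positions per 5-bit block, `rep` returns the same string as `gen`.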

Theorem 1 (Pseudorandomness of PUF-FE-composition). Let $\mathcal{P}$ be a
$(\ell, m, \delta; q_{puf}, \epsilon_{puf})$-family of PUFs which are
$(q_{prf}, \epsilon_{prf})$-pseudorandom with respect to some distribution
$\mathbb{D}$. Let $E = (\mathrm{Gen}, \mathrm{Rep})$ be an
$(m, n, \delta; H_\infty(\mathbb{D}), \epsilon_{FE})$ fuzzy extractor. The
advantage of any distinguisher that adaptively chooses pairwise distinct inputs
$x_1, \ldots, x_{q_{prf}}$ and receives outputs
$(z_1, \omega_1), \ldots, (z_{q_{prf}}, \omega_{q_{prf}})$ to distinguish the
following two distributions is at most
$\epsilon_{prf} + q_{prf} \cdot \epsilon_{FE}$:

1. $(z_i, \omega_i) = \mathrm{Gen}(\Pi(x_i))$ where $\Pi \leftarrow \mathcal{P}$

2. $(z_i, \omega_i)$ where $z_i \leftarrow U_n$,
   $(z'_i, \omega_i) = \mathrm{Gen}(\Pi(x_i))$ and $\Pi \leftarrow \mathcal{P}$

The analogous result holds if $\mathcal{P}$ is
$(q_{prf}, \epsilon_{prf})$-weak-pseudorandom and if the challenges $x_i$ are
sampled uniformly at random (instead of being adaptively selected), still under
the condition of being pairwise distinct.

Proof. We introduce an intermediate case, named case 1', where
$(z_i, \omega_i) = \mathrm{Gen}(y_i)$ with $y_i \leftarrow \mathbb{D}$ and
$\Pi \leftarrow \mathcal{P}$. Any distinguisher between case 1 and case 1' can be
turned into a distinguisher that distinguishes between PUF outputs and random
samples according to $\mathbb{D}$. Hence, the advantage is at most
$\epsilon_{prf}$ by assumption. Furthermore, by the usual hybrid argument and
the security property of fuzzy extractors, case 1' and case 2 can be
distinguished with advantage at most $q_{prf} \cdot \epsilon_{FE}$. □
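
The two steps of the proof can be chained explicitly. As a sketch, write
$\mathrm{Adv}(a, b)$ for the best advantage in distinguishing case $a$ from case
$b$; this shorthand is introduced here only for illustration. The triangle
inequality then gives:

```latex
\begin{align*}
\mathrm{Adv}(1, 2)
  &\le \mathrm{Adv}(1, 1') + \mathrm{Adv}(1', 2) \\
  &\le \epsilon_{prf} + q_{prf} \cdot \epsilon_{FE}.
\end{align*}
```

The factor $q_{prf}$ in the second term comes from the hybrid argument, which
replaces the extracted outputs one query at a time at a cost of at most
$\epsilon_{FE}$ per replacement.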

Definition 6 (PUF-(w)PRFs). Consider a family of (weakly) pseudorandom PUFs
$\mathcal{P}$ and a fuzzy extractor $E = (\mathrm{Gen}, \mathrm{Rep})$ (where the
parameters are as described in Theorem 1). A family of PUF-(w)PRFs is a set of
pairs of randomized procedures, called generation and reproduction. The
generation function $\mathrm{Gen} \circ \Pi$ for some PUF $\Pi \in \mathcal{P}$
takes as input $x \in \{0,1\}^\ell$ and outputs
$(z, \omega_x) \overset{\mathrm{def}}{=} \mathrm{Gen}(\Pi(x)) \in \{0,1\}^n \times \{0,1\}^*$,
while the reproduction function $\mathrm{Rep} \circ \Pi$ takes
$(x, \omega_x) \in \{0,1\}^\ell \times \{0,1\}^*$ as input and reproduces the
value $z = \mathrm{Rep}(\Pi(x), \omega_x)$.

Theorem 1 actually shows that PUF-(w)PRFs and “traditional” (w)PRFs have
in common that (part of) the output cannot be distinguished from uniformly
random values. One might be tempted to plug in PUF-(w)PRFs wherever PRFs
are required. Unfortunately, things are not that simple, since the information
saved in the helper data is also needed for correct execution. It is a known
fact that the helper data of a fuzzy extractor always leaks some information
about the input; see, e.g., [?]. Hence, extra attention must be paid when
deploying PUF-PRFs in cryptographic schemes. In the following section, we
describe an encryption scheme that achieves real-or-random security although the
helper data is made public.

5.2 A Luby-Rackoff Cipher Based on PUF-wPRFs 

A straightforward approach for using PUF-wPRFs against full non-volatile memory
attackers would be to use them for key derivation, where the key is afterwards
used in some encryption scheme. However, in this construction PUF-wPRFs would
ensure security against non-volatile memory attackers only, while the security
of the encryption scheme would need to be shown separately. In the following,
we present a construction that simultaneously protects against algorithmic and
physical attacks, while the security in both cases can be reduced to PUF-wPRF
properties.



Fig. 1. A randomized 3-round Luby-Rackoff-cipher based on PUF-PRFs 


One of the most important results with respect to PRFs was developed by
Luby and Rackoff in [?]. They showed how to construct pseudorandom permutations
from PRFs. Briefly summarized, a pseudorandom permutation (PRP) is a PRF that
is a permutation as well. PRPs can be seen as an idealization of block ciphers.
Consequently, the Luby-Rackoff construction is often termed the Luby-Rackoff
cipher.

Unfortunately, the Luby-Rackoff result does not automatically apply to the 
case of PUF-PRFs. As explained in the previous section, PUF-(w)PRFs differ 
from (w)PRFs as they additionally need some helper data for correct execution. 
First, it is unclear if and how the existence and necessity of helper data would 
fit into the established concept of PRPs. Second, an attacker might adaptively 
choose plaintexts to force internal collisions and use the information leakage of 
the helper data for checking for these events. 

Nonetheless, we can show that a Luby-Rackoff cipher based on PUF-wPRFs
also yields a secure block cipher. For this purpose, we consider the set of
concrete security notions for symmetric encryption schemes that has been
presented and discussed in [?]. More precisely, we prove that a randomized
version of a 3-round Luby-Rackoff cipher based on PUF-PRFs fulfills
real-or-random indistinguishability against a chosen-plaintext attacker.

In a nutshell, a real-or-random attacker adaptively chooses plaintexts and 
hands them to an encryption oracle. This oracle either encrypts the received 
plaintexts (real case) or some random plaintexts (random case). The encryptions 
are given back to the attacker. Her task is to distinguish between both cases. The 
scheme is real-or-random indistinguishable if the advantage of winning the game 
is negligible (in some security parameter). Next, we first define the considered 
block cipher and prove its security afterwards. 

Definition 7 (3-round PUF-wPRF-based Luby-Rackoff cipher). Let $\mathcal{F}$
denote a family of PUF-wPRFs with input and output length $n$. The 3-round
PUF-PRF-based Luby-Rackoff cipher $\mathcal{E}^{\mathcal{F}}$ uses three
different PUF-wPRFs $f_i \in \mathcal{F}$, $i = 1, 2, 3$, as round functions. The
working principle is very similar to the original Luby-Rackoff cipher and is
displayed in Figure 1. The main differences are twofold. First, at the beginning
some uniformly random value $\rho \in \{0,1\}^n$ is chosen to randomize the right
part $R$ of the plaintext. Second, the round functions are PUF-wPRFs that
generate two outputs: $z_i$ and $\omega_i$.

The ciphertext is $(X, Y, \omega_1, \omega_2, \omega_3, \rho)$. Decryption works
similarly to the case of the “traditional” Luby-Rackoff cipher, where the helper
data $\omega_i$ is used together with the Rep procedure for reconstructing the
output of the PUF-PRF $f_i$, and the value $\rho$ is used to “derandomize” the
input to the first round function $f_1$.
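
The data flow of the cipher can be sketched in software. Since Figure 1 is not
reproduced here, the wiring below is an assumption inferred from the standard
3-round Feistel structure and from the quantities used in the security proof
(in particular $Y = R \oplus \rho \oplus z_2$). The class `SimulatedPufWprf` is a
purely hypothetical stand-in: it replaces the PUF and fuzzy extractor by
HMAC-SHA256 under a random device secret and uses a random nonce as helper
data, so it offers none of the physical guarantees discussed in the text.

```python
import hashlib
import hmac
import secrets

N = 16  # half-block length in bytes (illustrative choice)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


class SimulatedPufWprf:
    """Software stand-in for one PUF-wPRF round function; NOT a real PUF."""

    def __init__(self):
        # Plays the role of the device-unique physical structure.
        self._device_secret = secrets.token_bytes(32)

    def gen(self, x: bytes):
        """Generation: return (z, omega) for input x."""
        omega = secrets.token_bytes(8)  # stand-in for public helper data
        return self.rep(x, omega), omega

    def rep(self, x: bytes, omega: bytes) -> bytes:
        """Reproduction: recompute z from x and the helper data omega."""
        return hmac.new(self._device_secret, omega + x, hashlib.sha256).digest()[:N]


def encrypt(f, left: bytes, right: bytes):
    rho = secrets.token_bytes(N)
    x1 = xor(right, rho)       # randomized right half, input to round 1
    z1, w1 = f[0].gen(x1)
    x2 = xor(left, z1)         # input to round 2
    z2, w2 = f[1].gen(x2)
    y = xor(x1, z2)            # right ciphertext part, input to round 3
    z3, w3 = f[2].gen(y)
    x = xor(x2, z3)            # left ciphertext part
    return x, y, w1, w2, w3, rho


def decrypt(f, ciphertext):
    x, y, w1, w2, w3, rho = ciphertext
    x2 = xor(x, f[2].rep(y, w3))
    x1 = xor(y, f[1].rep(x2, w2))
    left = xor(x2, f[0].rep(x1, w1))
    return left, xor(x1, rho)  # remove the randomization of the right half
```

Decryption only works on the device holding the three round functions, which
mirrors the observation that the scheme is suited to protecting stored data
rather than communication between two parties.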

Since there is no digital secret stored in non-volatile memory, even a full non- 
volatile memory attacker has no advantage in breaking this scheme. Although 
this makes encrypting digital communication between two different parties im- 
possible, various applications are imaginable, e.g., for encrypting data stored in 
untrusted or public storage. 

Theorem 2. Let $\mathcal{E}^{\mathcal{F}}$ be the encryption scheme defined in
Definition 7 using a family $\mathcal{F}$ of PUF-wPRFs (with parameters as
specified in Theorem 1). Then, the advantage of a real-or-random attacker making
up to $q_{prf}$ queries is at most
$4\epsilon_{prf} + 2 q_{prf} \cdot \epsilon_{FE} + 2 \cdot \frac{q_{prf}^2}{2^n}$.

Proof. Let $\{(L^{(i)}, R^{(i)})\}_{i=1,\ldots,q_{prf}}$ denote the sequence of the
adaptively chosen plaintexts, let $x_j^{(i)}, z_j^{(i)}$ be the respective inputs
and outputs to round function $f_j$, and let $\rho^{(i)}$ be the randomly chosen
values. We show the claim by defining a sequence of games and estimating the
advantages of distinguishing between them. Let the real game be the scenario in
which the distinguisher receives the encryptions of the plaintexts she chose.

In game 1, the outputs $z_1^{(i)}$ of the first round function $f_1$ are replaced
by some uniformly random values $z_1^{(i)} \leftarrow \{0,1\}^n$. Under the
assumption that the values $x_1^{(i)}$ are pairwise distinct, the advantage to
distinguish between both cases is at most
$\epsilon_{prf} + q_{prf} \cdot \epsilon_{FE}$ according to Theorem 1.
Furthermore, as the values $\rho^{(i)}$ are uniformly random, the probability of
a collision in the values $x_1^{(i)}$ is at most $\frac{q_{prf}^2}{2^n}$. As a
consequence, the advantage to distinguish between the real game and game 1 is
upper bounded by
$\epsilon_{prf} + q_{prf} \cdot \epsilon_{FE} + \frac{q_{prf}^2}{2^n}$.

Game 2 is defined like game 1, where now the inputs $x_1^{(i)}$ to the first
round function $f_1$ are randomized to $\tilde{x}_1^{(i)} \leftarrow \{0,1\}^n$.
Observe that the values $x_1^{(i)}$ are used in two different contexts: i) for
computing the right part of the ciphertext (by XORing with the output of the
second round function) and ii) as input to the first round function. Regarding
i), observe that the outputs of the second round function are independent of the
values $x_1^{(i)}$, as the values $z_1^{(i)}$ (and hence the inputs to $f_2$) are
uniformly random by definition, and that the values $x_1^{(i)}$ are independent
of the plaintext (because of $\rho^{(i)}$). Hence, i) and ii) represent two
independent features, possibly allowing for distinguishing between game 1 and
game 2, and hence can be examined separately.

The advantage of distinguishing between games 1 and 2 based on i) is equivalent
to deciding whether the values $R^{(i)} \oplus \rho^{(i)} \oplus Y^{(i)}$ are
uniformly random or belong to the outputs of the second round function. With the
same arguments as above, the advantage is upper bounded by
$\epsilon_{prf} + q_{prf} \cdot \epsilon_{FE} + \frac{q_{prf}^2}{2^n}$.

The advantage of distinguishing between game 1 and game 2 based on ii) is at
most the advantage of distinguishing
$(\Pi_1(x_1^{(1)}), \ldots, \Pi_1(x_1^{(q_{prf})}))$ from
$(\Pi_1(\tilde{x}_1^{(1)}), \ldots, \Pi_1(\tilde{x}_1^{(q_{prf})}))$, where
$\Pi_1$ denotes the PUF used in $f_1$. By the definition of wPRFs (Definition 4),
the advantage of distinguishing
$(\Pi_1(x_1^{(1)}), \ldots, \Pi_1(x_1^{(q_{prf})}))$ from
$(y_1, \ldots, y_{q_{prf}})$, where $y_i \leftarrow \mathbb{D}$ and $\mathbb{D}$
is an appropriate distribution, is at most $\epsilon_{prf}$. Actually, the same
holds for $(\Pi_1(\tilde{x}_1^{(1)}), \ldots, \Pi_1(\tilde{x}_1^{(q_{prf})}))$
(the fact that the values $\tilde{x}_1^{(i)}$ are unknown cannot increase the
advantage). Hence, by the triangle inequality, it follows that the advantage
regarding ii) is at most $2\epsilon_{prf}$. In total, the advantage to
distinguish between game 1 and game 2 is less than or equal to
$3\epsilon_{prf} + q_{prf} \cdot \epsilon_{FE} + \frac{q_{prf}^2}{2^n}$.

Finally, observe that it is indistinguishable whether $x_1^{(i)}$ or $R^{(i)}$ is
randomized and likewise whether $z_1^{(i)}$ or $L^{(i)}$. Hence, game 2 is
indistinguishable from the random game where the plaintexts are randomized.
Summing up, the advantage of a real-or-random attacker is at most
$4\epsilon_{prf} + 2 q_{prf} \cdot \epsilon_{FE} + 2 \cdot \frac{q_{prf}^2}{2^n}$. □
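
The bookkeeping of the game sequence can be summarized as follows. This is only
a sketch: $\mathrm{Adv}(G, G')$ is shorthand introduced here for the advantage of
distinguishing game $G$ from game $G'$, and $c$ stands for the collision term
that bounds the probability of two round-function inputs coinciding.

```latex
\begin{align*}
\mathrm{Adv}(\text{real}, \text{game 1})
  &\le \epsilon_{prf} + q_{prf}\,\epsilon_{FE} + c \\
\mathrm{Adv}(\text{game 1}, \text{game 2})
  &\le \underbrace{\epsilon_{prf} + q_{prf}\,\epsilon_{FE} + c}_{\text{part i)}}
     + \underbrace{2\,\epsilon_{prf}}_{\text{part ii)}} \\
\mathrm{Adv}(\text{game 2}, \text{random}) &= 0 \\
\mathrm{Adv}(\text{real}, \text{random})
  &\le 4\,\epsilon_{prf} + 2\,q_{prf}\,\epsilon_{FE} + 2c
\end{align*}
```

The last line is the sum of the three lines above it, again by the triangle
inequality.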

6 SRAM PRFs 

In the previous section, we showed that secure cryptographic schemes are
possible even if helper data is used that leaks information. In this section, we
show that in the concrete case, information leakage through helper data can
be avoided completely. We illustrate this approach on SRAM PUFs, which were
originally introduced and experimentally verified in [?]. With respect to our
modeling, an SRAM PUF is a realization of a
$(\ell, m, \delta; q_{puf}, \epsilon_{puf})$-PUF that is
$(2^\ell, 0)$-pseudorandom.

We introduce a new mode of operation that, similarly to the fuzzy extractor 
approach in the previous section, allows for extracting uniform values from SRAM 
PUFs in a reproducible way. This approach likewise stores some additional helper 
data but, as opposed to the case of fuzzy extractors, the helper data does not leak 
any information on the input. Hence, this construction might be of independent 
interest for SRAM PUF based applications. The proposed construction is based 
on two techniques: Temporal Majority Voting and Excluding Dark Bits. 

We denote the individual bits of a PUF response as $y = (y_0, \ldots, y_{m-1})$,
with $y_i \in \{0,1\}$. When performing a response measurement on a PUF $\Pi$,
every bit $y_i$ of the response is determined by a Bernoulli trial. Every $y_i$
has a most likely value $y_i^{(ML)} \in \{0,1\}$, and a certain probability
$p_i < 1/2$ of differing from this value, which we define as its bit error
probability. We denote by $y_i^{(k)}$ the $k$-th measurement or sample of bit
$y_i$ in a number of repeated measurements.

Definition 8 (Temporal Majority Voting (TMV)). Consider a Bernoulli
distributed random bit $y_i$ over $\{0,1\}$. We define temporal majority voting
of $y_i$ over $N$ votes, with $N$ an odd positive integer, as a function
$\mathrm{TMV}_N : \{0,1\}^N \to \{0,1\}$ that takes as input $N$ different samples
of $y_i$ and outputs the most often occurring value in these samples.

We can calculate the error probability $p_{N,i}$ of bit $y_i$ after TMV with $N$
votes as:

$$p_{N,i} \overset{\mathrm{def}}{=} \Pr\left[\mathrm{TMV}_N\left(y_i^{(0)}, \ldots, y_i^{(N-1)}\right) \neq y_i^{(ML)}\right] = 1 - \mathrm{Bin}_{N,p_i}\left(\frac{N-1}{2}\right) \leq p_i, \tag{1}$$

with $\mathrm{Bin}_{N,p_i}$ the cumulative distribution function of the binomial
distribution. From Eq. (1) it follows that applying TMV to a bit of a PUF
response effectively reduces the error probability from $p_i$ to $p_{N,i}$, with
$p_{N,i}$ becoming smaller as $N$ increases. We can determine the number of votes
$N$ we need to reach a certain threshold $p_T$ such that $p_{N,i} \leq p_T$,
given an initial error probability $p_i$. It turns out that $N$ rises
exponentially as $p_i$ gets close to 1/2. In practice, we also have to put a
limit $N_T$ on the number of votes we can perform, since each vote involves a
PUF response measurement. We call the pair $(N_T, p_T)$ a TMV-threshold.
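
Eq. (1) is easy to evaluate numerically. The following sketch computes the
post-TMV error probability as the upper tail of a binomial distribution and
searches for the smallest odd number of votes that meets a threshold. The
function names `tmv_error_probability` and `votes_needed` are illustrative.

```python
from math import comb


def tmv_error_probability(p: float, n_votes: int) -> float:
    """Error probability after majority voting over n_votes samples (n_votes odd)."""
    if n_votes < 1 or n_votes % 2 == 0:
        raise ValueError("n_votes must be an odd positive integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # The majority is wrong iff more than (N - 1) / 2 of the N samples are erroneous.
    return sum(comb(n_votes, k) * p ** k * (1 - p) ** (n_votes - k)
               for k in range((n_votes + 1) // 2, n_votes + 1))


def votes_needed(p: float, p_threshold: float, n_limit: int):
    """Smallest odd N <= n_limit with post-TMV error <= p_threshold, else None."""
    for n in range(1, n_limit + 1, 2):
        if tmv_error_probability(p, n) <= p_threshold:
            return n
    return None  # no admissible N within the limit
```

For example, with $p_i = 0.1$ and $N = 3$ the error probability drops to
$3 \cdot 0.1^2 \cdot 0.9 + 0.1^3 = 0.028$, whereas a bit with $p_i = 0.45$ cannot
reach a threshold of $10^{-9}$ within 99 votes.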

Definition 9 (Dark Bit (DB)). Let $(N_T, p_T)$ be a TMV-threshold. We define
a bit $y_i$ to be dark with respect to this threshold if $p_{N_T,i} > p_T$.

TMV alone cannot decrease the bit error probability to acceptable levels (e.g.,
$\leq 10^{-9}$) because of the non-negligible occurrence of dark bits. We use a
bit mask $\gamma$ to identify these dark bits in the generation phase, and
exclude them during reproduction. Similar to fuzzy extractors, $(N_T, p_T)$-TMV
and DB can be used for generating and reproducing uniform values from SRAM PUFs.

The Gen-procedure takes sufficient measurements of every response bit $y_i$
to make an accurate estimate of its most likely value $y_i^{(ML)}$ and of its
error probability $p_i$. If $y_i$ is dark with respect to $(N_T, p_T)$, then the
corresponding bit $\gamma_i$ in the bit mask $\gamma \in \{0,1\}^m$ is set to 0
and $y_i$ is discarded; otherwise $\gamma_i$ is set to 1 and $y_i^{(ML)}$ is
appended to the bit string $s$. The procedure Gen outputs a helper string
$\omega = (\gamma, \sigma)$ and an extracted string
$z = \mathrm{Extract}_\sigma(s)$, with $\mathrm{Extract}_\sigma$ a classical
strong extractor¹ with seed $\sigma$.

The Rep-procedure takes $N_T$ measurements of a response $y'$ and the
corresponding helper string $\omega = (\gamma, \sigma)$, with
$\gamma \in \{0,1\}^m$, as input. If $\gamma_i$ contains a 1, then the result of
$\mathrm{TMV}_{N_T}(y_i'^{(0)}, \ldots, y_i'^{(N_T-1)})$ is appended to a bit
string $s'$; otherwise, $y_i'$ is discarded. Rep outputs an extracted string
$z' = \mathrm{Extract}_\sigma(s')$.
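
The Gen and Rep procedures can be sketched against a simulated SRAM. Everything
below is an illustrative assumption rather than the authors' implementation:
`SimulatedSram` models each cell as a preferred value flipped with a fixed
probability, the error probability is estimated from `n_enroll` measurements,
and seeded SHA-256 stands in for the strong extractor.

```python
import hashlib
import random
from math import comb


def tmv_error(p: float, n: int) -> float:
    """Post-TMV error probability: upper binomial tail above (n - 1) / 2."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range((n + 1) // 2, n + 1))


class SimulatedSram:
    """Toy SRAM: cell i reads its preferred value, flipped with probability p[i]."""

    def __init__(self, preferred, error_probs, seed=0):
        self.preferred, self.error_probs = preferred, error_probs
        self._rng = random.Random(seed)

    def measure(self):
        return [b ^ (self._rng.random() < p)
                for b, p in zip(self.preferred, self.error_probs)]


def majority(samples) -> int:
    return int(2 * sum(samples) > len(samples))


def extract(seed: bytes, bits) -> bytes:
    """Stand-in for the strong extractor: seeded SHA-256 over the bit list."""
    return hashlib.sha256(seed + bytes(bits)).digest()


def gen(sram, n_t, p_t, n_enroll=1001, seed=b"public-seed"):
    """Estimate each bit, mask dark bits, and extract z from the remaining bits."""
    columns = list(zip(*(sram.measure() for _ in range(n_enroll))))
    mask, s = [], []
    for col in columns:
        ml = majority(col)                              # estimated most likely value
        p_hat = sum(v != ml for v in col) / n_enroll    # estimated bit error rate
        dark = tmv_error(p_hat, n_t) > p_t
        mask.append(0 if dark else 1)
        if not dark:
            s.append(ml)
    return extract(seed, s), (mask, seed)


def rep(sram, helper, n_t):
    """Re-measure n_t times, majority-vote the non-dark bits, and extract z'."""
    mask, seed = helper
    columns = list(zip(*(sram.measure() for _ in range(n_t))))
    s = [majority(col) for col, keep in zip(columns, mask) if keep]
    return extract(seed, s)
```

In this toy model, cells with error probabilities around 0.4 are masked out as
dark, and the remaining cells reproduce the same extracted string after voting.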

A strong extractor is a function that is able to generate nearly-uniform
outputs from inputs coming from a distribution with limited min-entropy. It
ensures that the statistical distance of the extracted output to the uniform
distribution is negligible. The required compression rate of
$\mathrm{Extract}_\sigma$ depends on the remaining min-entropy $\mu$ of the PUF
response $y$ after the helper data is observed. We call the above construction a
TMV-DB-SRAM-PUF.

¹ See, e.g., [?] for a definition of a strong extractor. Typical seed lengths of
strong extractors are on the order of 100 bits, and in most cases the same seed
can be reused for all outputs.

Using analogous arguments as in Theorem 1, one can show that the output of a
TMV-DB-SRAM-PUF is indistinguishable from random except with negligible
advantage. Additionally, in an SRAM PUF, the most likely value of a bit is
independent of whether or not the bit is a dark bit; hence no min-entropy on
the PUF output is leaked by the bit mask². However, by searching for matching
helper strings, an adversary might still be able to find colliding
TMV-DB-SRAM-PUF inputs (especially as the input size is small), which can pose a
possible security leak. In order to overcome this issue, we present the
following way of using a TMV-DB-SRAM-PUF:

Definition 10 (All-at-once mode). Consider a TMV-DB-SRAM-PUF as described
above. We define the all-at-once mode of operation to be the pair of procedures
(Enroll, Eval).

The enrollment procedure Enroll outputs a helper table
$\Omega \in \{0,1\}^{2^\ell \times *}$ when executed. The helper table is
constructed by running, $\forall x \in \{0,1\}^\ell$, the generation function
$(\mathrm{Gen} \circ \Pi)(x)$, and storing the obtained helper data $\omega_x$ as
the $x$-th element in $\Omega$, i.e., $\Omega[x] := \omega_x$.

The evaluation function
$\mathrm{Eval} : \{0,1\}^\ell \times \{0,1\}^{2^\ell \times *} \to \{0,1\}^n$ takes
an element $x \in \{0,1\}^\ell$ and a helper table
$\Omega \in \{0,1\}^{2^\ell \times *}$ as inputs and (after internal computation)
outputs a value $\mathrm{Eval}(x, \Omega) = z \in \{0,1\}^n$, with
$z = (\mathrm{Rep} \circ \Pi)(x, \Omega[x])$.

The Enroll-procedure has to be executed before the Eval-procedure, but it has
to be run only once for every PUF. Every invocation of Eval can take the same
(public) helper table $\Omega$ as one of its inputs. However, in order to
conceal exactly which helper string is used, it is important that the
Eval-procedure takes $\Omega$ as a whole as input, and does not just do a
look-up of $\Omega[x]$ in a public table $\Omega$. The all-at-once mode prevents
an adversary from learning which particular helper string is used during the
internal computation.

Definition 11 (SRAM-PRF). An SRAM-PRF is a TMV-DB-SRAM-PUF 
that runs in the all-at-once mode. 

Using the arguments given above, we argue that SRAM-PRFs are, for all practical
purposes, a physical realization of PRFs. Observe that one major drawback
of SRAM-PRFs is that the hardware size grows exponentially with the input
length. Thus, SRAM-PRFs cannot be used as a concrete instantiation of PUF-PRFs
for our construction from Section 5.2. This section rather shows an
alternative approach for constructing cryptographic mechanisms based on PUFs
despite the noise problem. As a possible application of SRAM-PRFs, we discuss
an expanding Luby-Rackoff cipher where the round functions are replaced
by SRAM-PRFs that take 8-bit challenges as input and produce 120-bit extracted
outputs. According to [?], at least 48 rounds are necessary for security
reasons.

As an instantiation for the PUF, we take an SRAM PUF with an assumed
average bit error probability of 15% and an estimated min-entropy content of
0.95 bit/cell. We use a TMV-threshold of $(N_T = 99, p_T = 10^{-9})$.
Simulations and experiments on the SRAM PUF show that about 30% of the SRAM
cells produce a dark bit with respect to this TMV-threshold. The strong
extractor only has to compress by a factor of $\frac{1}{0.95}$, accounting for
the limited min-entropy in the PUF response. Hence,
$\frac{1}{0.95} \cdot 2^8 \cdot \frac{120}{70\%}$ bits $\approx 5.6$ kbyte of SRAM
cells is needed to build one SRAM-PRF. Thus, the entire block cipher uses
$48 \cdot 5.6$ kbyte $\approx 271$ kbyte of SRAM cells. The helper tables also
require 5.6 kbyte each.

² As a consequence, no min-entropy on the PUF input is leaked either.
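
The sizing arithmetic can be checked with a few lines of code. The function name
`sram_bits_per_prf` and its parameters are illustrative, and the conversion
assumes 1 kbyte = 1024 bytes, which is the convention under which the figures
quoted above are reproduced.

```python
def sram_bits_per_prf(input_bits: int, output_bits: int,
                      min_entropy_per_cell: float, dark_fraction: float) -> float:
    """Raw SRAM cells needed so every input yields output_bits extracted bits."""
    usable_fraction = 1.0 - dark_fraction  # cells that survive the dark-bit mask
    return (2 ** input_bits) * output_bits / (usable_fraction * min_entropy_per_cell)


bits = sram_bits_per_prf(input_bits=8, output_bits=120,
                         min_entropy_per_cell=0.95, dark_fraction=0.30)
kbyte_per_prf = bits / 8 / 1024        # about 5.64
kbyte_total = 48 * kbyte_per_prf       # about 270.7 for 48 rounds
```

Note that the total of roughly 271 kbyte follows from the unrounded per-PRF
figure; multiplying the rounded 5.6 kbyte by 48 would give a slightly smaller
number.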

Implementing 48 SRAM PUFs using a total of 271 kbyte of SRAM cells is fea- 
sible on recent ICs, and 48 rounds can be evaluated relatively fast. Storing and 
loading 48 helper tables of 5.6 kbyte each is also achievable in practice. Observe 
that the size depends linearly on the number of rounds. The corresponding parameters
for more rounds can be easily derived. Reducing the input size of the SRAM-PRF
will yield an even smaller amount of needed SRAM cells and smaller helper tables, 
but the number of rounds will increase. A time-area trade-off is hence possible. 

7 Conclusions 

In this paper we propose a leakage-resilient encryption scheme that makes use 
of Physically Unclonable Functions (PUFs). The core component is a new PUF- 
based cryptographic primitive, termed PUF-PRF, that is similar to a pseudo- 
random function (PRF). We showed that PUF-PRFs possess cryptographically 
useful algorithmic and physical properties that come from the random character 
of their physical structures. 

Of course, any physical model can only approximately describe real life.
Although experiments support our model for the considered PUF implementations,
more analysis is necessary. In this context it would be interesting to consider
other types of PUFs which fit into our model or might be used for other
cryptographic applications. Furthermore, a natural continuation of this work
would be to explore other cryptographic schemes based on PUF-PRFs, e.g., hash
functions or public key encryption.

Abstract. A leakage-resilient cryptosystem remains secure even if arbitrary,
but bounded, information about the secret key (and possibly other
internal state information) is leaked to an adversary. Denote the length
of the secret key by $n$. We show:

- A full-fledged signature scheme tolerating leakage of $n - n^{\epsilon}$ bits of
  information about the secret key (for any constant $\epsilon > 0$), based on
  general assumptions.

- A one-time signature scheme, based on the minimal assumption of
  one-way functions, tolerating leakage of $(\frac{1}{4} - \epsilon) \cdot n$ bits
  of information about the signer’s entire state.

- A more efficient one-time signature scheme, which can be based on
  several specific assumptions, tolerating leakage of
  $(\frac{1}{2} - \epsilon) \cdot n$ bits of information about the signer’s entire
  state.

The latter two constructions extend to give leakage-resilient $t$-time
signature schemes. All the above constructions are in the standard model.


1 Introduction 

Proofs of security for cryptographic primitives traditionally treat the primitive 
as a “black box” that an adversary is able to access in a relatively limited fash- 
ion. For example, in the usual model for proving security of signature schemes, 
an adversary is given the public key and allowed to request signatures on any 
messages of its choice, but is unable to get any other information about the se- 
cret key or any internal randomness or state information used during signature 
generation. 

In real-world implementations of cryptographic primitives, on the other hand,
an adversary may be able to recover a significant amount of additional
information not captured by standard security models. Examples include
information leaked by side-channel cryptanalysis [20,21], fault attacks [?], or
timing attacks, or even bits of the secret key itself in case this key is
improperly stored or erased [?]. Potentially, schemes can also be attacked when
they are implemented using poor random number generation [?] (which can be
viewed as giving the adversary additional information on the internal state,
beyond what would be available if the output were truly random), or when the
same key is used in multiple contexts (e.g., for decryption and signing).

In the past few years, cryptographers have made tremendous progress toward
modeling security in the face of such information leakage, and in
constructing leakage-resilient cryptosystems secure even in case such leakage
occurs. (There has also been corresponding work on reducing unwanted leakage
by, e.g., building tamper-proof hardware; this is not the focus of our work.)
Most relevant to the current work is a recent series of results
showing cryptosystems that guarantee security even when arbitrary information
about the secret key is leaked (under suitable restrictions); we discuss this
work, along with other related results, in further detail below. This prior work
gives constructions of stream ciphers (and hence stateful symmetric-key
encryption and MACs), symmetric-key encryption schemes, public-key
encryption schemes, and signature schemes achieving various notions
of leakage resilience.

Most prior work has focused on primitives for ensuring secrecy. The only work
of which we are aware that deals with authenticity is that of Alwen et al., which
shows, among other results, leakage-resilient signature schemes based on
number-theoretic assumptions in the random oracle model.^1 Here we give
constructions of leakage-resilient signature schemes based on general assumptions
in the standard model; our main construction also tolerates more leakage than
the schemes of Alwen et al. (In the full version we also show some technical
improvements to their results.) We postpone a more thorough discussion of our
results until after we define leakage resilience in more detail.


1.1 Modeling Leakage Resilience 

At a high level, definitions of leakage resilience take the following form: Begin
with a “standard” security notion (e.g., existential unforgeability under adaptive
chosen-message attacks) and modify this definition by allowing the adversary
to (adaptively) specify a series of leakage functions $f_1, f_2, \ldots$. The adversary,
in addition to getting whatever other information is specified by the original
security definition, is given the result of applying each $f_i$ to the secret key and
possibly other internal state of the honest party (e.g., the signer). We then require
that the adversary's success probability (for signature schemes, the probability
with which it can output a forged signature) remain negligible. It should be
clear that this is a general methodology that can be applied to many different
primitives. The exact model is then determined by the restrictions placed on the
leakage function(s) $f_i$:

Limited vs. arbitrary information. A first consideration regards whether
the $\{f_i\}$ can be arbitrary (polynomial-time computable) functions, or whether
they are restricted to be in some more limited class. Early work considered
the latter case, for example where the adversary is restricted to learning
specific bits of the secret key [5], or the values on specific wires of the circuit
implementing the primitive. More recent work allows arbitrary $\{f_i\}$.

1 The results of Alwen et al. were obtained independently of our own work.

Bounded vs. unbounded information leakage. Let $n$ denote the length of
the secret key. If the secret key does not change over time, and the $\{f_i\}$ are
allowed to be arbitrary, then security in the traditional sense cannot be achieved
once the total length of the leakage (that is, the outputs of all the $\{f_i\}$) is
$n$ bits or more. For the case of signatures, the length of the leakage must also
be less than the signature length. This inherent restriction is used in prior work.
(Alwen et al. [2] do not impose this restriction, but as a consequence can only
achieve a weaker notion of security.)

One can avoid this restriction, and potentially tolerate an unbounded amount
of leakage overall, if the secret key is updated over time; even in this case, one
must somehow limit the amount of leakage between successive key updates. This
approach to leakage resilience has been considered in the context of stateful
symmetric-key primitives, and in the context of stateful signature schemes.

One can also avoid imposing a bound on the leakage by restricting the $\{f_i\}$,
as discussed next.

Computational min-entropy of the secret key. If the leakage is much
shorter than the secret key (as discussed above), then the secret key will have
high min-entropy conditioned on the leakage. This setting is considered in
several prior works, and is also enforced on a per-period basis in other work
(i.e., the leakage per time period is required to be shorter than the secret key).
More recent work shows schemes that remain secure for leakage of arbitrary
length, as long as the secret key remains exponentially hard to compute
given the leakage (but even if the secret key is fully determined by the leakage
in an information-theoretic sense). A drawback of this guarantee is that given
some collection of functions $\{f_i\}$ (say, as determined experimentally for some
particular set of side-channel attacks) there is no way to tell, in general, whether
they satisfy the stated requirement or not. Furthermore, existing results in this
direction currently require super-polynomial hardness assumptions.

Inputs to the leakage functions. A final issue is the allowed inputs to the
leakage functions. Some prior work assumes, following [25], that only computation
leaks information; this is modeled by letting each $f_i$ take as input only
those portions of the secret key that are accessed during the $i$th phase of the
scheme. Halderman et al., however, show that memory contents can be
leaked even when they are not being accessed. Motivated (in part) by this result,
more recent schemes allow the $\{f_i\}$ to take the entire secret key as
input at all times.

For the specific primitives considered in that line of work, the secret key $sk$
is the only internal state maintained by the party holding the secret key, and
so allowing the $\{f_i\}$ to depend on $sk$ is (almost) the most general choice.^2 For
signature schemes, however, any randomness used during signing might also be
leaked to an adversary. The strongest definition of leakage resilience is thus
obtained by allowing the $\{f_i\}$ to depend on all the state information used by
the honest signer during the course of the experiment.

All these variants may be meaningful depending on the particular attacks one
is trying to model. Memory attacks, which probe long-term secret information
during a time when computation is not taking place, can be faithfully
modeled by allowing the leakage functions to take only $sk$ as input. On the
other hand, side-channel attacks that collect information while computation is
occurring might be more accurately captured by allowing the leakage functions
to take as input only those portions of the internal state that are being accessed.

1.2 Our Results 

With the preceding discussion in mind, we can now describe our results in further 
detail. In all cases, we allow the leakage function(s) to be arbitrary as long as 
the total leakage is bounded as some function of the secret key length n; recall 
that such a restriction on the leakage is essential if the secret key is unchanging, 
as it is in all our schemes. Our results can be summarized as follows: 

1. We show a construction of a leakage-resilient signature scheme that is
existentially unforgeable against chosen-message attacks in the standard model,
based on general (as opposed to number-theoretic) assumptions. This scheme
tolerates leakage of $n - n^{\epsilon}$ bits of information about the secret key for any
$\epsilon > 0$ based on polynomial hardness assumptions, and can tolerate (optimal)
$n - \omega(\log n)$ bits of leakage based on sub-exponential hardness assumptions.

2. We also construct two leakage-resilient one-time (resp., $t$-time) signature
schemes in the standard model. These schemes are more efficient than the
scheme above; they also tolerate leakage that may depend on the entire state
of the signer (rather than just the secret key).

- Our first scheme is based on the minimal assumption that one-way functions
exist, and tolerates leakage of $(1/4 - \epsilon) \cdot n$ bits for any $\epsilon > 0$. The
construction extends to give a $t$-time signature scheme tolerating leakage
of $\Theta(n/t)$ bits.

- Our second scheme, which can be based on various concrete assumptions,
is more efficient and tolerates leakage of up to $(1/2 - \epsilon) \cdot n$ bits for any
$\epsilon > 0$. This construction also extends to give a $t$-time signature scheme
tolerating leakage of $\Theta(n/t)$ bits.

In the full version of this work, we also discuss efficient constructions of
full-fledged signature schemes based on number-theoretic assumptions (in the
random oracle model) that are secure as long as the leakage is bounded by
$(1/2 - \epsilon) \cdot n$ bits for any $\epsilon > 0$. Similar schemes were discovered
independently by Alwen et al. [2], but our analysis offers some advantages as
compared to theirs. Specifically, we make explicit the fact that the leakage can
depend on the entire state of the signer, and we allow leakage queries to depend
on the random oracle.

2 More generally, one could also allow the $\{f_i\}$ to depend on the randomness used to
generate the (public and) secret key(s); this possibility is mentioned in prior work
(Section 8.2 there). (For the specific schemes considered in that work, however, this
makes no substantive difference.)

Independent of our work, Faust et al. describe a transformation from any
3-time signature scheme tolerating a(n) bits of leakage to a full-fledged (but
stateful) signature scheme where the secret key is updated over time; the
resulting scheme tolerates a(n) bits of leakage between key updates, and unbounded
leakage overall. (In the transformed signature scheme, security is ensured as long
as the leakage depends only on the active portion of the secret key.) Applying
this transformation to our constructions, we get full-fledged signature schemes
that tolerate unbounded leakage (subject to the restrictions mentioned above).

1.3 Overview of Our Techniques 

Our constructions all rely on the same basic idea. Roughly, we consider signature 
schemes with the following properties: 

- A given public key $pk$ corresponds to a set $S_{pk}$ of exponentially many secret
keys. Furthermore, given $(sk, pk)$ with $sk \in S_{pk}$ it remains hard to compute
any other $sk' \in S_{pk}$.

- The secret key $sk$ used by the signer has high min-entropy (at least in a
computational sense) even for an adversary who observes signatures on messages
of its choice. (For our one-time scheme, this is only required to hold
for an adversary who observes a single signature.)

- A signature forgery can be used to compute a secret key in $S_{pk}$.

To prove that any such signature scheme is leakage resilient, we show how to
use an adversary $\mathcal{A}$ attacking the scheme to find distinct $sk, sk' \in S_{pk}$ given
$(sk, pk)$ (in violation of the assumed hardness of doing so). Given $(sk, pk)$, we
simply run $\mathcal{A}$ on input $pk$ and respond to its signing queries using the given
key $sk$. Leakage queries can also be answered using $sk$. If the adversary forges
a signature, we extract some $sk' \in S_{pk}$; it remains only to show that $sk' \neq sk$
with high probability. Let $n = \log |S_{pk}|$ be the (computational) min-entropy of $sk$
conditioned on $pk$ and the signatures seen by the adversary. (We assume that all
secret keys in $S_{pk}$ are equally likely, which will be the case in our constructions.)
A standard argument (cf. Lemma 1) shows that if the leakage is bounded by $\ell$
bits, then the conditional min-entropy of the secret key is still at least $n - \ell - t$
bits except with probability $2^{-t}$. So as long as the leakage is bounded away
from $n$, with high probability the min-entropy of $sk$ conditioned on $\mathcal{A}$'s entire
view is still at least 1. But then $sk' \neq sk$ with probability at least 1/2. This
concludes the outline of the proof. We remark, however, that various subtleties
arise in the formal proofs of security.

Some existing signature schemes in the random oracle model already satisfy
the requirements stated above. In particular, these include schemes constructed
using the Fiat-Shamir transform applied to a witness-indistinguishable
$\Sigma$-protocol where there are an exponential number of witnesses for a given
statement. Concrete examples include the signature schemes of Okamoto
(extending the Schnorr and Guillou-Quisquater schemes) based on the
discrete logarithm or RSA assumptions, as well as the signature scheme of
Fischlin and Fischlin (extending the Ong-Schnorr scheme) based on
the hardness of factoring. This class of schemes was also considered by Alwen et
al. See the full version of our paper for further discussion.

We are not aware of any existing signature scheme in the standard model
that meets our requirements. We construct one as follows. Let $H$ be a universal
one-way hash function (UOWHF) mapping $n$-bit inputs to $n^{\epsilon}$-bit outputs.
The secret key of the signature scheme is $x \in \{0,1\}^n$, and the public key is
$(y = H(x), pk, r)$ where $pk$ is a public key for a CPA-secure public-key encryption
scheme, and $r$ is a common reference string for an unbounded simulation-sound
NIZK proof system. A signature on a message $m$ consists of an encryption
$C \leftarrow \mathsf{Enc}_{pk}(m \| x)$ of both $m$ and $x$, along with a proof $\pi$ that $C$ is an encryption
of $m \| x'$ with $H(x') = y$. Observe that, with high probability over choice of $x$,
there are exponentially many pre-images of $y = H(x)$ and hence exponentially
many valid secret keys; furthermore, finding another such secret key $sk' \neq sk$
requires finding a collision in $H$. Details are given in Section 3.

Our leakage-resilient one-time signature schemes are constructed using a similar
idea. The first construction is inspired by the Lamport signature scheme.
The secret key is $\{(x_{i,0}, x_{i,1})\}_{i=1}^{k}$ and the public key is $\{(y_{i,0}, y_{i,1})\}_{i=1}^{k}$ where
$y_{i,b} = H(x_{i,b})$ for $H$ a UOWHF. Once again, there are exponentially many
secret keys associated with any public key and finding any two such keys yields
a collision in $H$. Adapting the Lamport scheme, so that the signature on a
message $m = m_1 \cdots m_k$ is $(x_{1,m_1}, \ldots, x_{k,m_k})$, yields a signature scheme secure
against leakage of $n^{1-\epsilon}$ bits. By first encoding the message using an error-correcting code
with high minimum distance, it is possible to “boost” the leakage resilience to
$(1/4 - \epsilon) \cdot n$ bits. Using cover-free families this approach extends also to give a
leakage-resilient $t$-time signature scheme. These constructions are all described
in Section 4.1.

Our second construction builds on ideas that can be traced back to earlier work.
Roughly, let $(G, \oplus)$ and $(G', \otimes)$ be groups with $\log |G'| < \epsilon \cdot \log |G|$, and let
$\mathcal{H} = \{H_s : G \to G'\}$ be a family of collision-resistant hash functions that are also
homomorphic (i.e., for which $H_s(a) \otimes H_s(b) = H_s(a \oplus b)$); such hash functions can
be constructed based on a variety of concrete assumptions (see Section 4.2). The
secret key is a pair of elements $a, b \in G$, and the public key is $(s, H_s(a), H_s(b))$
for a random key $s$. Note, there are exponentially many secret keys associated
with any public key and finding any two such secret keys yields a collision in $H_s$.
The signature on a message $m \in \{1, \ldots, \mathrm{ord}(G)\}$ is simply $\sigma = a \oplus mb$, which can
be verified by checking that $H_s(\sigma) = H_s(a) \otimes m H_s(b)$. The important property
for our purposes is that given a single signature $a \oplus mb$, the secret key $(a, b)$
still has high min-entropy. So if the adversary forges another signature $\sigma'$ for a
message $m' \neq m$, with high probability it holds that $\sigma' \neq a \oplus m'b$ and we obtain
a collision in $H_s$.
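
The second construction can be illustrated with a toy instantiation. The sketch below is an assumption-laden example rather than the construction analyzed in this paper: it takes $G = \mathbb{Z}_Q^2$ under addition, takes $G'$ to be the order-$Q$ subgroup of $\mathbb{Z}_P^*$ written multiplicatively (so that $m H_s(b)$ becomes `pow(hb, m, P)`), and uses a Pedersen-style hash with tiny hypothetical parameters `P`, `Q`. With this choice $\log |G'|$ is about half of $\log |G|$, so the condition $\log |G'| < \epsilon \cdot \log |G|$ holds only for $\epsilon > 1/2$; all function names are illustrative.

```python
import secrets

P, Q = 2039, 1019   # toy safe prime P = 2Q + 1; far too small for real use

def gen_key():
    """Hash key s = (g1, g2): two random non-identity elements of the order-Q subgroup."""
    return tuple(pow(4, 1 + secrets.randbelow(Q - 1), P) for _ in range(2))

def H(s, a):
    """H_s(a1, a2) = g1^a1 * g2^a2 mod P, a homomorphism from (Z_Q^2, +) to G'."""
    return (pow(s[0], a[0], P) * pow(s[1], a[1], P)) % P

def keygen():
    s = gen_key()
    a = (secrets.randbelow(Q), secrets.randbelow(Q))
    b = (secrets.randbelow(Q), secrets.randbelow(Q))
    return (s, H(s, a), H(s, b)), (a, b)      # (public key, secret key)

def sign(sk, m):
    """sigma = a + m*b, computed componentwise in Z_Q^2."""
    a, b = sk
    return tuple((ai + m * bi) % Q for ai, bi in zip(a, b))

def verify(pk, m, sigma):
    """Accept iff H_s(sigma) = H_s(a) * H_s(b)^m in G'."""
    s, ha, hb = pk
    return H(s, sigma) == (ha * pow(hb, m, P)) % P
```

Many pairs $(a, b)$ map to the same public key here, which is the property the overview relies on: one signature fixes $a \oplus mb$ but leaves the pair itself undetermined.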
2 Definitions and Preliminaries

We provide a formal definition of leakage resilience for signature schemes, and 
state a technical lemma that will be used in our analysis. We denote the security 
parameter by k , and let ppt stand for “probabilistic polynomial time”. 

Definition 1. A signature scheme is a tuple of ppt algorithms (Gen, Sign, Vrfy)
such that:

- Gen is a randomized algorithm that takes as input $1^k$ and outputs $(pk, sk)$,
where $pk$ is the public key and $sk$ is the secret key.

- Sign is a (possibly) randomized algorithm that takes as input the secret key
$sk$, the public key $pk$, and a message $m$, and outputs a signature $\sigma$. We
denote this by $\sigma \leftarrow \mathsf{Sign}_{sk}(m)$, leaving the public key implicit.^3

- Vrfy is a deterministic algorithm that takes as input a public key $pk$, a message
$m$, and a purported signature $\sigma$. It outputs a bit $b$ indicating acceptance
or rejection, and we write this as $b := \mathsf{Vrfy}_{pk}(m, \sigma)$.

It is required that for all $k$, all $(pk, sk)$ output by $\mathsf{Gen}(1^k)$, and all messages $m$
in the message space, we have $\mathsf{Vrfy}_{pk}(m, \mathsf{Sign}_{sk}(m)) = 1$.

Our definition of leakage resilience is the standard notion of existential
unforgeability under adaptive chosen-message attacks, except that we additionally
allow the adversary to specify arbitrary leakage functions $\{f_i\}$ and obtain
the value of these functions applied to the secret key (and possibly other state
information).

Definition 2. Let $\Pi$ = (Gen, Sign, Vrfy) be a signature scheme, and let $\lambda$ be a
function. Given an adversary $\mathcal{A}$, define the following experiment parameterized
by $k$:

1. Choose $r \leftarrow \{0,1\}^*$ and compute $(pk, sk) := \mathsf{Gen}(1^k; r)$. Set state := $\{r\}$.

2. Run $\mathcal{A}(1^k, pk)$. The adversary may then adaptively access a signing oracle
$\mathsf{Sign}_{sk}(\cdot)$ and a leakage oracle $\mathsf{Leak}(\cdot)$ that have the following functionality:

- In response to the $i$th query $\mathsf{Sign}_{sk}(m_i)$, this oracle chooses random
$r_i \leftarrow \{0,1\}^*$, computes $\sigma_i := \mathsf{Sign}_{sk}(m_i; r_i)$, and returns $\sigma_i$ to $\mathcal{A}$. It also sets
state := state $\cup \{r_i\}$.

- In response to the $i$th query $\mathsf{Leak}(f_i)$ (where $f_i$ is specified as a circuit),
this oracle gives $f_i$(state) to $\mathcal{A}$. (To make the definition meaningful in
the random oracle model, the $\{f_i\}$ are allowed to be oracle circuits that
depend on the random oracle $H$.)

The $\{f_i\}$ can be arbitrary, subject to the restriction that the total output
length of all the $f_i$ is at most $\lambda(|sk|)$.

3. At some point, $\mathcal{A}$ outputs $(m, \sigma)$.

$\mathcal{A}$ succeeds if (1) $\mathsf{Vrfy}_{pk}(m, \sigma) = 1$ and (2) $m$ was not previously queried to the
$\mathsf{Sign}_{sk}(\cdot)$ oracle. We denote the probability of this event by $\Pr[\mathsf{Succ}^{\lambda\text{-FullLeakage}}_{\mathcal{A},\Pi}(k)]$.
We say $\Pi$ is fully $\lambda$-leakage resilient if $\Pr[\mathsf{Succ}^{\lambda\text{-FullLeakage}}_{\mathcal{A},\Pi}(k)]$ is negligible for every
ppt adversary $\mathcal{A}$.

3 Usually one assumes without loss of generality that the public key is included as part
of the secret key. Since we measure leakage as a function of the secret-key length,
however, we seek to minimize the size of the secret key.

If state is not updated after each signing query (and therefore, always contains
only the randomness $r$ used to generate the secret key), we denote the
probability of success by $\Pr[\mathsf{Succ}^{\lambda\text{-Leakage}}_{\mathcal{A},\Pi}(k)]$ and say $\Pi$ is $\lambda$-leakage resilient if
$\Pr[\mathsf{Succ}^{\lambda\text{-Leakage}}_{\mathcal{A},\Pi}(k)]$ is negligible for every ppt adversary $\mathcal{A}$.
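
The bookkeeping in Definition 2 can be sketched as a small test harness. This is a minimal illustration under stated assumptions, not part of the formal definition: the class name `LeakageExperiment`, the callables `gen`, `sign`, and `vrfy`, and the convention that a leakage function returns a string of `'0'`/`'1'` characters are all hypothetical. The output length is checked after evaluation, whereas in the definition it is fixed by the circuit describing $f_i$.

```python
import secrets
from typing import Callable, List

class LeakageExperiment:
    """Toy harness for the experiment of Definition 2 (all names are illustrative)."""

    def __init__(self, gen, sign, leak_bound_bits: int, full: bool = True):
        r = secrets.token_bytes(32)          # coins r used by Gen
        self.pk, self.sk = gen(r)
        self.sign_alg = sign
        self.state: List[bytes] = [r]        # state := {r}
        self.budget = leak_bound_bits        # remaining part of lambda(|sk|)
        self.full = full                     # True: signing coins are added to state
        self.queried = set()

    def sign_oracle(self, m: bytes):
        r_i = secrets.token_bytes(32)        # fresh signing coins r_i
        if self.full:
            self.state.append(r_i)           # state := state U {r_i}
        self.queried.add(m)
        return self.sign_alg(self.sk, m, r_i)

    def leak_oracle(self, f: Callable[[List[bytes]], str]) -> str:
        out = f(list(self.state))            # f sees a copy of the current state
        if any(c not in "01" for c in out) or len(out) > self.budget:
            raise ValueError("malformed output or leakage budget exceeded")
        self.budget -= len(out)              # total output length is bounded
        return out

    def is_forgery(self, vrfy, m: bytes, sigma) -> bool:
        # Success requires a valid signature on a message never sent to the oracle.
        return m not in self.queried and bool(vrfy(self.pk, m, sigma))
```

Setting `full=False` corresponds to the weaker variant in which state always contains only the key-generation coins.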

Leakage resilience in the definition above corresponds to the memory attacks
studied in prior work (except that we allow the leakage to depend also on the
random coins used to generate the secret key). Other variations of the definition
are, of course, also possible: state could include only $sk$ (and not the random
coins $r$ used to generate it), or could include only the most recently used
random coins $r_i$.

2.1 A Technical Lemma 

Let $X$ be a random variable taking values in $\{0,1\}^n$. The min-entropy of $X$ is

\[ H_\infty(X) := \min_{x \in \{0,1\}^n} \{ -\log_2 \Pr[X = x] \}. \]

The conditional min-entropy of $X$ given an event $E$ is defined as:

\[ H_\infty(X \mid E) := \min_{x \in \{0,1\}^n} \{ -\log_2 \Pr[X = x \mid E] \}. \]

Lemma 1. Let $X$ be a random variable with $H := H_\infty(X)$, and fix $\Delta \in [0, H]$.
Let $f$ be a function whose range has size $2^{\lambda}$, and set

\[ Y := \{ y \in \{0,1\}^{\lambda} \mid H_\infty(X \mid y = f(X)) \leq H - \Delta \}. \]

Then

\[ \Pr[f(X) \in Y] \leq 2^{\lambda - \Delta}. \]

In words: the probability that knowledge of $f(X)$ decreases the min-entropy of
$X$ by $\Delta$ or more is at most $2^{\lambda - \Delta}$. Put differently, the min-entropy of $X$ after
observing $f(X)$ is greater than $H'$ except with probability at most $2^{\lambda - H + H'}$.


Proof. Fix $y$ in the range of $f$ and $x \in \{0,1\}^n$ with $f(x) = y$. Since

\[ \Pr[X = x \mid y = f(X)] = \frac{\Pr[X = x]}{\Pr[y = f(X)]}, \]

we have that $y \in Y$ only if $\Pr[y = f(X)] \leq 2^{-\Delta}$. The assumption regarding the
range of $f$ implies $|Y| \leq 2^{\lambda}$, and so $\Pr[f(X) \in Y] \leq 2^{\lambda - \Delta}$ as claimed.
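
The bound of Lemma 1 can be checked numerically on small examples. The sketch below assumes $X$ is uniform on $\{0,1\}^n$ (so $H = n$, and conditioned on $f(X) = y$ the variable $X$ is uniform over the preimages of $y$); the function name `lemma1_check` and its arguments are hypothetical, and inputs are represented as integers in $[0, 2^n)$.

```python
from collections import Counter
from math import log2

def lemma1_check(n: int, f, lam: int, delta: float):
    """Return (Pr[f(X) in Y], 2^(lam - delta)) for X uniform on {0,1}^n."""
    N = 2 ** n
    H = n                                        # min-entropy of a uniform n-bit X
    image_counts = Counter(f(x) for x in range(N))
    assert len(image_counts) <= 2 ** lam         # range of f has size at most 2^lam
    prob_bad = 0.0
    for y, cnt in image_counts.items():
        # Given f(X) = y, X is uniform over cnt preimages: min-entropy log2(cnt).
        if log2(cnt) <= H - delta:
            prob_bad += cnt / N                  # Pr[f(X) = y]
    return prob_bad, 2.0 ** (lam - delta)
```

The first returned value should never exceed the second, mirroring the counting argument in the proof: each bad $y$ has probability at most $2^{-\Delta}$, and there are at most $2^{\lambda}$ of them.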
3 A Leakage-Resilient Signature Scheme

We construct a leakage-resilient signature scheme in the standard model,
following the intuition described in Section 1.3. Let $(\mathsf{Gen}_H, H)$ be a public-coin^4
UOWHF mapping $n$-bit inputs to $\frac{1}{2} \cdot n^{\epsilon}$-bit outputs for $n = \mathrm{poly}(k)$ and
$\epsilon \in (0,1)$. Let $(\mathsf{Gen}_E, \mathsf{Enc}, \mathsf{Dec})$ be a CPA-secure, dense^5 public-key encryption
scheme, and let $(\ell, \mathcal{P}, \mathcal{V}, \mathcal{S}_1, \mathcal{S}_2)$ be an unbounded simulation-sound NIZK proof
system for the following language $L$:

\[ L = \{ (s, y, pk, m, C) : \exists x, \omega \text{ s.t. } C = \mathsf{Enc}_{pk}(x; \omega) \text{ and } H_s(x) = y \}. \]

The signature scheme is defined as follows: 

Key generation: Choose random $x \leftarrow \{0,1\}^n$ and compute $s \leftarrow \mathsf{Gen}_H(1^k)$.
Obliviously sample a public key $pk$ for the encryption scheme, and choose
a uniformly random string $r$ to serve as the common random string of the
NIZK proof system. The public key is $(s, y := H_s(x), pk, r)$ and
the secret key is $x$.

Signing: To sign message $m$ using secret key $x$ and public key $(s, y, pk, r)$,
first choose random $\omega$ and compute $C := \mathsf{Enc}_{pk}(x; \omega)$. Then compute
$\pi \leftarrow \mathcal{P}_r((s, y, pk, m, C), (x, \omega))$; i.e., $\pi$ is a proof that $(s, y, pk, m, C) \in L$ using
witness $(x, \omega)$. The signature is $(C, \pi)$.

Verification: Given a signature $(C, \pi)$ on the message $m$ with respect to the
public key $(s, y, pk, r)$, output 1 iff $\mathcal{V}_r((s, y, pk, m, C), \pi) = 1$.

Theorem 1. Under the stated assumptions, the scheme above is $(n - n^{\epsilon})$-leakage
resilient.

Proof (Sketch). Let $\Pi$ denote the scheme given above, and let $\mathcal{A}$ be a ppt
adversary with $\delta = \delta(k) := \Pr[\mathsf{Succ}^{\lambda\text{-Leakage}}_{\mathcal{A},\Pi}(k)]$. We consider a sequence of
experiments, and let $\Pr_i[\cdot]$ denote the probability of an event in experiment $i$. We
abbreviate $\mathsf{Succ}^{\lambda\text{-Leakage}}_{\mathcal{A},\Pi}(k)$ by Succ.

Experiment 0: This is the experiment of Definition 2. Given the public key
$(s, y, pk, r)$ defined by the experiment, Succ denotes the event that $\mathcal{A}$ outputs
$(m, (C, \pi))$ where $\mathcal{V}_r((s, y, pk, m, C), \pi) = 1$ and $m$ was never queried to the
signing oracle. By assumption, we have $\Pr_0[\mathsf{Succ}] = \delta$.

Experiment 1: We introduce the following differences with respect to the
preceding experiment: when setting up the public key, we now generate the common
random string $r$ of the simulation-sound NIZK by computing $(r, \tau) \leftarrow \mathcal{S}_1(1^k)$.
Furthermore, signing queries are now answered as follows: to sign $m$, generate
$C \leftarrow \mathsf{Enc}_{pk}(x)$ as before but compute $\pi$ as $\pi \leftarrow \mathcal{S}_2((s, y, pk, m, C), \tau)$.
It follows from the (adaptive) zero-knowledge property of $(\ell, \mathcal{P}, \mathcal{V}, \mathcal{S}_1, \mathcal{S}_2)$ that
the difference $|\Pr_1[\mathsf{Succ}] - \Pr_0[\mathsf{Succ}]|$ must be negligible.

4 For a public-coin UOWHF it is hard to find a second pre-image even given
the randomness used to generate the hash key. Standard constructions of UOWHFs
have this property.

5 This means it is possible to sample a public key “obliviously,” without knowing the
corresponding secret key.

Experiment 2: We modify the preceding experiment in the following way: to
answer a signing query for a message $m$, compute $C \leftarrow \mathsf{Enc}_{pk}(0^n)$ (and then
compute $\pi$ as in Experiment 1). CPA-security of the encryption scheme implies
that $|\Pr_2[\mathsf{Succ}] - \Pr_1[\mathsf{Succ}]|$ is negligible.

Experiment 3: We now change the way the public key is generated. Namely,
instead of obliviously sampling the encryption public key $pk$ we compute it as
$(pk, sk) \leftarrow \mathsf{Gen}_E(1^k)$. Note that this is only a syntactic change and so $\Pr_3[\mathsf{Succ}] =
\Pr_2[\mathsf{Succ}]$. (This assumes perfect oblivious sampling; if an obliviously generated
public key and a legitimately generated public key are only computationally
indistinguishable, then the probability of Succ is affected by a negligible amount.)

Given the public key $(s, y, pk, r)$ defined by the experiment, let Ext be the event
that $\mathcal{A}$ outputs $(m, (C, \pi))$ such that the event Succ occurs and furthermore,
$H_s(\mathsf{Dec}_{sk}(C)) = y$. Unbounded simulation soundness of the NIZK proof system
implies that $|\Pr_3[\mathsf{Ext}] - \Pr_3[\mathsf{Succ}]|$ is negligible. (Note that by definition of $L$ the
message $m$ is included as part of the statement being proved, and so if $\mathcal{A}$ did
not request a signature on $m$ then it was never given a simulated proof of the
statement $(s, y, pk, m, C)$.)

To complete the proof, we show that $\Pr_3[\mathsf{Ext}]$ is negligible. Consider the
following adversary $\mathcal{B}$ finding a second preimage in the UOWHF: $\mathcal{B}$ chooses random
$x \leftarrow \{0,1\}^n$ and is given key $s$ (along with the randomness used to generate $s$).
$\mathcal{B}$ then runs Experiment 3 with $\mathcal{A}$. In this experiment all signatures given to $\mathcal{A}$
are simulated (as described in Experiment 3 above); furthermore $\mathcal{B}$ can easily
answer any leakage queries made by $\mathcal{A}$ since $\mathcal{B}$ knows a legitimate secret key.
(Recall that here we allow the leakage functions to be applied only to [the
randomness used to generate] the secret key, but not to any auxiliary state used
during signing.) If event Ext occurs when $\mathcal{A}$ terminates, then $\mathcal{B}$ recovers a value
$x' := \mathsf{Dec}_{sk}(C)$ for which $H_s(x') = y = H_s(x)$; i.e., $\mathcal{B}$ recovers such an $x'$ with
probability exactly $\Pr_3[\mathsf{Ext}]$. We now argue that $x' \neq x$ with high probability.

The only information about $x$ revealed to $\mathcal{A}$ in Experiment 3 comes from the
value $y$ included in the public key and the leakage queries asked by $\mathcal{A}$; these total
at most $\frac{1}{2} n^{\epsilon} + (n - n^{\epsilon}) = n - \frac{1}{2} n^{\epsilon}$ bits. Using Lemma 1 with $\Delta = H_\infty(x) = n$, the
probability that $H_\infty(x \mid \mathcal{A}\text{'s view}) = 0$ (i.e., the probability that $x$ is uniquely
determined by the view of $\mathcal{A}$) is at most $2^{-n^{\epsilon}/2}$, which is negligible. When the
conditional min-entropy of $x$ is greater than 0 there are at least two (equally
probable) possibilities for $x$ and so $x' \neq x$ with probability at least 1/2. Taken
together, the probability that $\mathcal{B}$ recovers $x' \neq x$ with $H_s(x') = H_s(x)$ is at least

\[ \frac{1}{2} \cdot \left( \Pr_3[\mathsf{Ext}] - 2^{-n^{\epsilon}/2} \right). \]

We thus see that if $\Pr_3[\mathsf{Ext}]$ is not negligible then $\mathcal{B}$ violates the security of the
UOWHF with non-negligible probability, a contradiction. □




If we are willing to rely on sub-exponential hardness assumptions, we can
construct a UOWHF with $\omega(\log n)$-bit outputs. In that case, the same signature
scheme tolerates (optimal) leakage of $n - \omega(\log n)$ bits.

4 Fully Leakage-Resilient Bounded-Use Signature Schemes

In this section we describe constructions of fully leakage-resilient one-time and
$t$-time signature schemes. These results are incomparable to the result of the
previous section: on the positive side, here we achieve full leakage resilience
(that is, where the leakage depends not only on the secret key, but also on the
randomness used by the signer) as well as better efficiency (and, in one case, rely
on weaker assumptions); on the downside, the schemes given here are only secure
when the adversary obtains a bounded number of signatures, and the leakage
that can be tolerated is lower.

4.1 A Construction Based on One-Way Functions 

We describe a basic one-time signature scheme, and then present an extension
that tolerates leakage of up to a constant fraction of the secret key length. Let
$(\mathsf{Gen}_H, H)$ be a UOWHF mapping $k^c$-bit inputs to $k$-bit outputs for some $c > 1$.
(As before, we assume that $H$ is a public-coin UOWHF, i.e., it is secure even
given the randomness used to generate the hash key.) Our basic scheme is a
variant on Lamport's signature scheme using $H$ as the one-way function:

Key generation: Choose random $x_{i,0}, x_{i,1} \leftarrow \{0,1\}^{k^c}$ for $i = 1, \ldots, k$, and
generate $s \leftarrow \mathsf{Gen}_H(1^k)$. Compute $y_{i,b} := H_s(x_{i,b})$ for $i \in \{1, \ldots, k\}$ and
$b \in \{0,1\}$. The public key is $(s, \{y_{i,b}\})$ and the secret key is $\{x_{i,b}\}$.

Signing: The signature on a $k$-bit message $m = m_1 \cdots m_k$ consists of the $k$
values $x_{1,m_1}, \ldots, x_{k,m_k}$.

Verification: Given a signature $x_1, \ldots, x_k$ on the $k$-bit message $m = m_1 \cdots m_k$,
with respect to the public key $(s, \{y_{i,b}\})$, output 1 iff $y_{i,m_i} = H_s(x_i)$ for all $i$.
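
The three algorithms of the basic scheme can be sketched as follows. This is an illustration under explicit assumptions: keyed SHA-256 stands in for the UOWHF $H_s$ (the paper only requires some public-coin UOWHF), and the constants `K` and `X_BYTES` are hypothetical toy values chosen so that the secret values are longer than the hash output.

```python
import hashlib
import secrets

K = 16          # message length k in bits (toy value)
X_BYTES = 64    # byte length of each secret value x_{i,b}; stands in for k^c bits

def H(s: bytes, x: bytes) -> bytes:
    """Keyed hash standing in for the UOWHF H_s (SHA-256 is an assumption)."""
    return hashlib.sha256(s + x).digest()

def keygen():
    s = secrets.token_bytes(16)                        # hash key s
    sk = [(secrets.token_bytes(X_BYTES), secrets.token_bytes(X_BYTES))
          for _ in range(K)]                           # pairs (x_{i,0}, x_{i,1})
    pk = (s, [(H(s, x0), H(s, x1)) for x0, x1 in sk])  # pairs (y_{i,0}, y_{i,1})
    return pk, sk

def sign(sk, m_bits):
    """Reveal x_{i,m_i} for each message bit m_i."""
    return [sk[i][b] for i, b in enumerate(m_bits)]

def verify(pk, m_bits, sig):
    """Accept iff y_{i,m_i} = H_s(x_i) for every position i."""
    s, ys = pk
    return len(sig) == len(m_bits) == len(ys) and all(
        ys[i][b] == H(s, sig[i]) for i, b in enumerate(m_bits))
```

Because each $x_{i,b}$ is longer than its hash, many secret keys are consistent with a given public key, which is what the leakage argument exploits.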

It can be shown that the above scheme is fully $n^{(c-1)/(c+1)}$-leakage resilient (as
a one-time signature scheme), where $n = 2k^{c+1}$ denotes the length of the secret
key. Setting $c$ appropriately, the above approach thus tolerates leakage $n^{1-\epsilon}$
for any desired $\epsilon > 0$. (We omit the proof, since we will prove security for an
improved scheme below.) The bound on the leakage is essentially tight, since
an adversary who obtains the signature on the message $0^k$ and then leaks the
value $x_{1,1}$ (which is only $k^c = (n/2)^{c/(c+1)}$ bits) can forge a signature on the
message $10^{k-1}$.

Tolerating leakage linear in the secret key length. An extension of the 
above scheme allows us to tolerate greater leakage. Specifically, we apply Lam- 
port’s scheme to a high-distance encoding of the message. Details follow. 

If $A$ is a $k \times \ell$ matrix over $\{0,1\}$ (viewed as the field $\mathbb{F}_2$), then $A$ defines a
(linear) error-correcting code $C \subseteq \{0,1\}^{\ell}$ where the message $m \in \{0,1\}^k$ (viewed
as a row vector) is mapped to the codeword $m \cdot A$. It is well known that for every
$\epsilon > 0$ there exists a constant $R$ such that choosing $A \in \{0,1\}^{k \times Rk}$ uniformly
at random defines a code with relative minimum distance $1/2 - \epsilon$, except with
probability negligible in $k$. (We will not need efficient decodability.)

Fix a constant $\epsilon \in (0,1)$ and let $R$ be as above; set $\ell = Rk$. Let $(\mathsf{Gen}_H, H)$
be a UOWHF mapping $\ell_{in}$-bit inputs to $k$-bit outputs where $\ell_{in} = 2k/\epsilon$. The
signature scheme is defined as:

Key generation: Choose random $A \in \{0,1\}^{k \times \ell}$ and $x_{i,0}, x_{i,1} \leftarrow \{0,1\}^{\ell_{in}}$ for
$i = 1, \ldots, \ell$. Generate $s \leftarrow \mathsf{Gen}_H(1^k)$. Compute $y_{i,b} := H_s(x_{i,b})$ for
$i \in \{1, \ldots, \ell\}$ and $b \in \{0,1\}$. The public key is $(A, s, \{y_{i,b}\})$ and the secret key
is $\{x_{i,b}\}$.

Signing: To sign a message $m \in \{0,1\}^k$, first compute $\bar{m} = m \cdot A \in \{0,1\}^{\ell}$.
The signature then consists of the $\ell$ values $x_{1,\bar{m}_1}, \ldots, x_{\ell,\bar{m}_\ell}$.

Verification: Given a signature $x_1, \ldots, x_\ell$ on the message $m$ with respect to
the public key $(A, s, \{y_{i,b}\})$, first compute $\bar{m} = m \cdot A$ and then output 1 iff
$y_{i,\bar{m}_i} = H_s(x_i)$ for all $i$.
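
The encoded variant differs from the basic scheme only in that the Lamport values are indexed by the codeword $\bar{m} = m \cdot A$ rather than by $m$ itself. The sketch below makes the same assumptions as any toy example would: keyed SHA-256 stands in for the UOWHF, and `K`, `R`, and `X_BYTES` are hypothetical small parameters that do not satisfy $\ell_{in} = 2k/\epsilon$ or guarantee the stated minimum distance.

```python
import hashlib
import secrets

K = 8            # message length k (toy value)
R = 4            # rate constant R (hypothetical choice)
ELL = R * K      # codeword length l = R*k
X_BYTES = 32     # byte length of each x_{i,b}; stands in for l_in bits

def H(s: bytes, x: bytes) -> bytes:
    """Keyed hash standing in for the UOWHF H_s (SHA-256 is an assumption)."""
    return hashlib.sha256(s + x).digest()

def encode(A, m_bits):
    """Codeword m*A over F_2, where A is a K x ELL matrix of 0/1 entries."""
    return [sum(m_bits[r] & A[r][c] for r in range(K)) % 2 for c in range(ELL)]

def keygen():
    A = [[secrets.randbelow(2) for _ in range(ELL)] for _ in range(K)]
    s = secrets.token_bytes(16)
    sk = [(secrets.token_bytes(X_BYTES), secrets.token_bytes(X_BYTES))
          for _ in range(ELL)]
    pk = (A, s, [(H(s, x0), H(s, x1)) for x0, x1 in sk])
    return pk, sk

def sign(pk, sk, m_bits):
    """Reveal x_{i, mbar_i} for each bit mbar_i of the codeword."""
    cw = encode(pk[0], m_bits)
    return [sk[i][b] for i, b in enumerate(cw)]

def verify(pk, m_bits, sig):
    A, s, ys = pk
    cw = encode(A, m_bits)
    return len(sig) == ELL and all(
        ys[i][b] == H(s, sig[i]) for i, b in enumerate(cw))
```

Since any two distinct codewords differ in many positions when the code has large minimum distance, a forgery on a new message must supply many secret values that the signer never revealed.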

Theorem 2. If $H$ is a UOWHF then the scheme above is a one-time signature
scheme that is fully $(1/4 - \epsilon) \cdot n$-leakage resilient, where $n = 2\ell \cdot \ell_{in}$ denotes the
length of the secret key.

Proof. Let $\Pi$ denote the scheme given above, and let $\mathcal{A}$ be a ppt adversary
with $\delta = \delta(k) := \Pr[\mathsf{Succ}^{\lambda\text{-FullLeakage}}_{\mathcal{A},\Pi}(k)]$. We construct an adversary $\mathcal{B}$ breaking
the security of $H$ with probability at least $(\delta - \mathrm{negl}(k))/4\ell$, implying that $\delta$ must
be negligible.

$\mathcal{B}$ chooses random $A \in \{0,1\}^{k \times \ell}$ and $x_{i,0}, x_{i,1} \leftarrow \{0,1\}^{\ell_{in}}$ for $i = 1, \ldots, \ell$; we
let $X = \{x_{i,b}\}$ denote the set of secret key values $\mathcal{B}$ chooses and observe that
$H_\infty(X) = 2\ell \cdot \ell_{in}$. Next, $\mathcal{B}$ selects a random $b^* \in \{0,1\}$ and a random index
$i^* \in \{1, \ldots, \ell\}$, and outputs $x_{i^*,b^*}$; it is given in return a hash key $s$. Then $\mathcal{B}$
computes $y_{i,b} := H_s(x_{i,b})$ for all $i, b$ and gives the public key $(A, s, \{y_{i,b}\})$ to $\mathcal{A}$.

$\mathcal{B}$ answers the signing and leakage queries of $\mathcal{A}$ using the secret key $\{x_{i,b}\}$
that it knows. Since this secret key is distributed identically to the secret key of
an honest signer, the simulation for $\mathcal{A}$ is perfect and $\mathcal{A}$ outputs a forgery with
probability $\delta$.

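Let $\bar{m}$ denote the encoding of the message $m$ whose signature was requested
by A. The information A has about the secret key $X$ consists of: (1) the signature
it obtained; (2) the values $\{y_{i,1-\bar{m}_i}\}_{i=1}^{\ell}$ from the public key;
and (3) the answers to the leakage queries asked by A. Together, these total
$\ell \cdot \ell_{in} + \ell k + (1/4 - \epsilon) \cdot 2\ell \cdot \ell_{in}$ bits. By the lemma on conditional min-entropy stated earlier, it follows that
$H_\infty(X \mid A\text{'s view}) > (1/2 + \epsilon) \cdot \ell \cdot \ell_{in}$ except with probability at most

\[ 2^{(\ell \ell_{in} + \ell k + (1/2 - 2\epsilon)\,\ell \ell_{in}) \,-\, 2\ell \ell_{in} \,+\, (1/2 + \epsilon)\,\ell \ell_{in}} \;=\; 2^{\ell k - \epsilon \ell \ell_{in}} \;=\; 2^{-\ell k}, \]

which is negligible.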
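
The exponent in this bound can be checked mechanically. The following Python sketch (function and argument names are hypothetical) evaluates it with exact rational arithmetic, assuming $\ell_{in} = 2k/\epsilon$ as set earlier, and confirms that it collapses to $-\ell k$.

```python
from fractions import Fraction

def failure_exponent(k: int, ell: int, eps: Fraction) -> Fraction:
    """Exponent e such that the min-entropy bound fails with probability <= 2^e."""
    ell_in = 2 * k / eps                          # l_in = 2k / eps
    # Bits seen by the adversary: signature + public-key values + leakage.
    seen = ell * ell_in + ell * k + (Fraction(1, 4) - eps) * 2 * ell * ell_in
    total_entropy = 2 * ell * ell_in              # H_inf(X)
    target = (Fraction(1, 2) + eps) * ell * ell_in
    return seen - total_entropy + target

# Example with assumed toy values k = 128, l = 512, eps = 1/8.
exponent = failure_exponent(128, 512, Fraction(1, 8))
```

For any choice of the three inputs the result equals `-ell * k`, which is why the failure probability is negligible in $k$.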

Assuming $H_\infty(X \mid A\text{'s view}) > (1/2 + \epsilon) \cdot \ell \cdot \ell_{in}$, there is no set $I \subseteq [\ell]$ with
$|I| \geq (1/2 - \epsilon) \cdot \ell$ for which the values $\{x_{i,1-\bar{m}_i}\}_{i \in I}$ are all fixed given A's view.
To see this, assume the contrary. Then

\[ H_\infty(X \mid A\text{'s view}) \;\leq\; H_\infty\big(\{x_{i,1-\bar{m}_i}\}_{i \notin I} \mid A\text{'s view}\big) \;\leq\; (1/2 + \epsilon) \cdot \ell \cdot \ell_{in}, \]

in contradiction to the assumed bound on the conditional min-entropy of $X$.

Let $(m^*, (x_1^*, \ldots, x_\ell^*))$ denote the forgery output by A, and let $\bar{m}^* = m^* \cdot A$
denote the encoding of $m^*$. Let $I$ be the set of indices where $\bar{m}$ and $\bar{m}^*$ differ;
with all but negligible probability over choice of the matrix A it holds that
$|I| \geq (1/2 - \epsilon) \cdot \ell$ and so we assume this to be the case. By the argument of the
previous paragraph, it cannot be the case that the $\{x_{i,1-\bar{m}_i}\}_{i \in I}$ are all fixed
given A's view. But then with probability at least half we have $x_i^* \neq x_{i,\bar{m}^*_i}$ for
at least one index $i \in I$. Assuming this to be the case, with probability at least
$1/2\ell$ this difference occurs at the index $(i^*, b^*)$ guessed at the outset by B; when
this happens B has found a collision in H for the given hash key $s$. Putting
everything together, we see that B finds a collision in H with probability at
least $(\delta - \mathrm{negl}(k)) \cdot \frac{1}{4\ell}$ as claimed.

A t-time signature scheme. The idea above can be further extended to give
a fully leakage resilient t-time signature scheme using cover-free families. We
follow the definition of m 

Definition 3. A family of non-empty sets $\mathcal{S} = \{S_1, \ldots, S_N\}$, where $S_i \subseteq U$, is
$(t, \frac{1}{2})$-cover-free if for all distinct $S, S_1, \ldots, S_t \in \mathcal{S}$ we have $|S \setminus \bigcup_{i=1}^{t} S_i| \geq |S|/2$.
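
Definition 3 can be tested by brute force on small families. The Python sketch below does so for the classical family of polynomial graphs over $\mathbb{Z}_q$ (two distinct polynomials of degree less than $d$ agree on at most $d-1$ points). This family is used here only as an easy-to-verify example; it is not the explicit construction referred to in the text, and all names are hypothetical.

```python
from itertools import combinations

def poly_family(q: int, d: int):
    """Sets S_p = {(x, p(x) mod q)} for all polynomials p of degree < d over Z_q (q prime)."""
    family = []
    for idx in range(q ** d):
        coeffs = [(idx // q ** j) % q for j in range(d)]
        family.append(frozenset(
            (x, sum(c * x ** j for j, c in enumerate(coeffs)) % q)
            for x in range(q)))
    return family

def is_cover_free(family, t: int) -> bool:
    """Check that every S keeps at least half its elements after removing any t other sets."""
    for S in family:
        others = [X for X in family if X != S]
        for group in combinations(others, t):
            covered = set().union(*group)
            if 2 * len(S - covered) < len(S):
                return False
    return True

family = poly_family(5, 2)   # 25 sets of size 5 over a universe of 25 points
```

With `q = 5` and `d = 2`, any two other sets cover at most two points of a given set, so the family is $(2, \frac{1}{2})$-cover-free, while three other sets can cover three of its five points.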

Porat and Rothschild show an explicit construction that, for any $t$ and $k$,
yields a $(t, \frac{1}{2})$-cover-free family $\mathcal{S} = \{S_1, \ldots, S_N\}$ where the number of sets
is $N = \Omega(2^k)$, the size of each set is $|S_i| = O(kt)$, and the universe size is
$|U| = O(kt^2)$. If we let $f : \{0,1\}^k \to \mathcal{S}$ denote an injective map, we obtain the
following scheme:

Key generation: Set $\ell = O(kt^2)$ and $\ell_{in} = 8tk$. Choose $x_i \leftarrow \{0,1\}^{\ell_{in}}$ for $i =
1, \ldots, \ell$. Generate $s \leftarrow \mathsf{Gen}_H(1^k)$, and compute $y_i := H_s(x_i)$ for $i \in \{1, \ldots, \ell\}$.
The public key is $(s, \{y_i\}_{i=1}^{\ell})$ and the secret key is $\{x_i\}_{i=1}^{\ell}$.

Signing: To sign a message $m \in \{0,1\}^k$, first compute $f(m) = S_m \in \mathcal{S}$. The
signature then consists of $\{x_i\}_{i \in S_m}$.

Verification: Given a signature $\{x_i\}$ on the message $m$ with respect to the
public key $(s, \{y_i\})$, first compute $S_m = f(m)$ and then output 1 iff $y_i =
H_s(x_i)$ for all $i \in S_m$.
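
A minimal Python sketch of this t-time scheme follows. It makes several assumptions that are not from the text: SHA-256 with a prepended key stands in for $H_s$, the injective map $f$ sends an integer message to the graph of a polynomial over $\mathbb{Z}_5$ (a toy cover-free family suitable for $t = 2$), and the constants and function names are hypothetical.

```python
import hashlib
import secrets

Q, D = 5, 2      # toy field size and degree bound: Q**D = 25 possible messages
ELL = Q * Q      # universe size |U|; one secret value per universe element

def H(s: bytes, x: bytes) -> bytes:
    """Keyed hash standing in for the UOWHF H_s (assumption: SHA-256)."""
    return hashlib.sha256(s + x).digest()

def f(m: int):
    """Map message m in 0..Q**D-1 to the index set S_m (graph of a polynomial)."""
    coeffs = [(m // Q ** j) % Q for j in range(D)]
    return [x * Q + sum(c * x ** j for j, c in enumerate(coeffs)) % Q
            for x in range(Q)]

def keygen():
    s = secrets.token_bytes(16)
    sk = [secrets.token_bytes(32) for _ in range(ELL)]
    pk = (s, [H(s, x) for x in sk])
    return pk, sk

def sign(sk, m):
    """Reveal the secret values indexed by S_m."""
    return {i: sk[i] for i in f(m)}

def verify(pk, m, sig):
    s, y = pk
    return all(i in sig and H(s, sig[i]) == y[i] for i in f(m))
```

Because distinct index sets overlap in at most one position here, the values revealed by two signatures still leave most of any third set unopened, which is the property the cover-free condition provides.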

A proof of the following proceeds along exactly the same lines as the proof of
Theorem 2.

Theorem 3. If H is a UOWHF then the scheme above is a t-time signature
scheme that is fully $\Theta(n/t)$-leakage resilient, where $n = \ell \cdot \ell_{in}$ denotes the length
of the secret key.

4.2 A Construction from Homomorphic Collision-Resistant Hashing 

Our second construction of fully leakage-resilient bounded-use signature schemes
relies on homomorphic collision-resistant hash functions, defined below. In
Section 4.3 we describe efficient instantiations of the hash functions we need based
on several concrete assumptions.

We concentrate on the case of one-time signatures, and defer a treatment of
t-time signatures to the full version.

Definition 4. Fix $\epsilon \in (0,1)$. A pair of ppt algorithms $(\mathsf{Gen}_H, H)$ is an
$\epsilon$-homomorphic collision-resistant hash function family ($\epsilon$-hCRHF) if:

1. $\mathsf{Gen}_H(1^k)$ outputs a key $s$ that specifies groups $(G, \oplus)$, $(G', \otimes)$ (written
additively), and two sets $S, T \subseteq G$ such that

- $\log |S| = \omega(\log k)$ and $\log |G'| \leq \epsilon \cdot \log |S|$ and $\log |T| \leq (1 + \epsilon) \log |S|$.

- $S$ is efficiently sampleable, and elements of $S$ can be represented using
$\log |S| + O(1)$ bits.

- $T$ is efficiently recognizable, and $\{x + my \mid x, y \in S,\ 0 \leq m < 2^k\} \subseteq T$.

2. The key $s$ defines a function $H_s : G \to G'$ with $H_s(x \oplus y) = H_s(x) \otimes H_s(y)$
for all $x, y \in G$.

3. There exists a constant $c$ (independent of $k$) for which the following holds.
For any $s$, any $m, m'$ with $0 \leq m < m' < 2^k$, and any $\sigma, \sigma'$:

\[ \big|\{x, y \in S \mid H_s(x + my) = \sigma \,\wedge\, H_s(x + m'y) = \sigma'\}\big| \leq 2^c. \]

4. No ppt algorithm A can find two distinct elements $x, y \in T$ such that $H_s(x) = H_s(y)$.
Namely, the following is negligible for all ppt A:

\[ \Pr[s \leftarrow \mathsf{Gen}_H(1^k);\ (x, y) \leftarrow A(s) : x, y \in T \,\wedge\, x \neq y \,\wedge\, H_s(x) = H_s(y)]. \]

If the above holds even when A is given the randomness used to generate $s$,
then $(\mathsf{Gen}_H, H)$ is a strong $\epsilon$-hCRHF.

Define a signature scheme as follows. 

Key generation: Compute $s \leftarrow \mathsf{Gen}_H(1^k)$; this specifies groups $(G, \oplus)$, $(G', \otimes)$
and sets $S, T$. Choose $x, y$ uniformly at random from $S$. Output $sk := (x, y)$
and $pk := (s, H_s(x), H_s(y))$.

Signing: The scheme is defined for messages $m$ satisfying $0 \leq m < 2^k$. Given
$m$, output the signature $\sigma := x \oplus my$.

Verification: Given a signature $\sigma$ on the message $m$ with respect to the public
key $pk = (s, a, b)$, output 1 iff $\sigma \in T$ and $H_s(\sigma) = a \otimes mb$.
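
To make the scheme concrete, here is a minimal Python sketch that plugs in a discrete-logarithm style hash $H_s(x_1, \ldots, x_\ell) = \prod_i g_i^{x_i}$ over a toy prime-order subgroup. Everything numeric is an assumption for illustration (the primes 2039 and 1019, the message length `K`, and `ELL = 4`, corresponding to an assumed $\epsilon = 1/4$), the function names are hypothetical, and the group is far too small to be secure.

```python
import secrets

P_MOD = 2039   # toy safe prime, P_MOD = 2 * P_ORD + 1
P_ORD = 1019   # prime order p of the subgroup of squares; p > 2^K
K = 9          # messages satisfy 0 <= m < 2^K
ELL = 4        # vector length l (assumed value)

def gen():
    """Sample ELL random non-identity elements of the order-p subgroup."""
    gens = []
    while len(gens) < ELL:
        g = pow(secrets.randbelow(P_MOD - 2) + 2, 2, P_MOD)
        if g != 1:
            gens.append(g)
    return gens

def H(s, vec):
    """Homomorphic hash: product of g_i^{x_i} modulo P_MOD."""
    out = 1
    for g, e in zip(s, vec):
        out = out * pow(g, e, P_MOD) % P_MOD
    return out

def keygen():
    s = gen()
    x = [secrets.randbelow(P_ORD) for _ in range(ELL)]
    y = [secrets.randbelow(P_ORD) for _ in range(ELL)]
    return (s, H(s, x), H(s, y)), (x, y)

def sign(sk, m):
    """sigma = x + m*y, computed componentwise in Z_p."""
    x, y = sk
    return [(xi + m * yi) % P_ORD for xi, yi in zip(x, y)]

def verify(pk, m, sigma):
    s, a, b = pk
    in_T = len(sigma) == ELL and all(0 <= v < P_ORD for v in sigma)
    return in_T and H(s, sigma) == a * pow(b, m, P_MOD) % P_MOD
```

Verification never touches the secret key: it relies only on the homomorphic property, which turns the check $H_s(x \oplus my) = H_s(x) \otimes m H_s(y)$ into one multiplication and one exponentiation in this multiplicative toy group.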

Theorem 4. If $(\mathsf{Gen}_H, H)$ is a (strong) $\epsilon$-hCRHF, then the above is a one-time
signature scheme that is (fully) $(\frac{1}{2} - 2\epsilon) \cdot n$-leakage resilient.

Proof. Correctness is easily verified. Let $\Pi$ denote the scheme given above, and
let A be a ppt adversary with $\delta = \delta(k) = \Pr[\mathsf{Succ}^{\lambda\text{-leakage}}_{A,\Pi}(k)]$. We construct
an adversary B breaking the security of $(\mathsf{Gen}_H, H)$ with probability at least
$\delta/2 - \mathrm{negl}(k)$, implying that $\delta$ must be negligible.

B is given as input a key $s$ (along with the randomness used to generate
it). B chooses $x, y \in S$, sets $sk := (x, y)$, and gives the public key $pk :=
(s, H_s(x), H_s(y))$ to A. Algorithm B then answers the signing and leakage queries
of A using the secret key $(x, y)$ that it knows. Since this secret key is distributed
identically to the secret key of an honest signer, the simulation for A is perfect
and A outputs a valid forgery $(m', \sigma')$ with probability $\delta$. If this occurs, then B
outputs $(\sigma', x \oplus m'y)$ as a candidate collision for $H_s$.

Note that $x \oplus m'y \in T$. If $\sigma'$ is a valid signature on $m'$, we have $\sigma' \in T$ and

\[ H_s(\sigma') = H_s(x) \otimes m' H_s(y) = H_s(x \oplus m'y). \]

It remains to show that $\sigma' \neq x \oplus m'y$ with significant probability.

Let $c$ be the constant guaranteed to exist by condition 3 of Definition 4. The
length of the secret key is $n = 2 \log |S|$ bits. The information A has about
$sk = (x, y)$ consists of: (1) the signature $x \oplus my$ it obtained; (2) the values
$H_s(x), H_s(y)$ from the public key; and (3) the answers to the leakage queries
asked by A. These total at most

\[ \log |T| + 2 \log |G'| + (\tfrac{1}{2} - 2\epsilon) \cdot 2 \log |S| \;\leq\; (1 + \epsilon) \log |S| + 2\epsilon \log |S| + \log |S| - 4\epsilon \log |S| \;=\; 2 \log |S| - \epsilon \log |S| \]

bits of information about $sk$. The min-entropy of $sk$ is $2 \log |S|$ bits, so by
the lemma on conditional min-entropy stated earlier it follows that $H_\infty(sk \mid A\text{'s view}) > c + 1$ except with probability
at most $2^{-\epsilon \log |S| + c + 1}$, which is negligible.

Assuming $H_\infty(sk \mid A\text{'s view}) > c + 1$, we claim that for any $m' \neq m$ (with
$0 \leq m' < 2^k$) the value $x \oplus m'y$ has min-entropy at least 1; this follows from
the fact that, for any fixed $\sigma'$, the two equations $\sigma = x \oplus my$ and $\sigma' = x \oplus m'y$
constrain $(x, y)$ to a set of size at most $2^c$ (by condition 3 of Definition 4). Thus,
$\sigma' = x \oplus m'y$ with probability at most 1/2. Putting everything together, we see
that B finds a collision in $H_s$ with probability at least $(\delta - \mathrm{negl}(k)) \cdot \frac{1}{2}$ as claimed.

4.3 Constructing (Strong) Homomorphic CRHFs 

Homomorphic CRHFs can be constructed from a variety of standard assump- 
tions. Here, we describe constructions based on the discrete logarithm and the 
RSA assumptions; in the full version, we show a construction based on lattices. 
All except the RSA-based construction are strong $\epsilon$-hCRHFs.

An instantiation based on the discrete logarithm assumption. Let $G'$
be a group of prime order $p > 2^k$ where the discrete logarithm problem is hard.
Let $\ell = \lceil 1/\epsilon \rceil$, and set $S = T = G = \mathbb{Z}_p^{\ell}$.

The key-generation algorithm $\mathsf{Gen}_H$ outputs random $g_1, \ldots, g_\ell \in G'$ as the
key. Given $s = (g_1, \ldots, g_\ell)$, define $H_s(x_1, \ldots, x_\ell) = \prod_{i=1}^{\ell} g_i^{x_i}$. This function is
clearly homomorphic, and collision resistance follows by standard arguments.
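
The "standard argument" for collision resistance is that any collision reveals a discrete-logarithm relation among the $g_i$. The Python sketch below illustrates this for $\ell = 2$ in a toy group; the primes, the generator 4, the exponent `w`, and the function names are all assumptions made for the example.

```python
P_MOD = 2039   # toy safe prime
P_ORD = 1019   # prime order p of the subgroup of squares modulo P_MOD

def H(s, vec):
    """H_s(x_1, ..., x_l) = product of g_i^{x_i} in the order-p subgroup."""
    out = 1
    for g, e in zip(s, vec):
        out = out * pow(g, e, P_MOD) % P_MOD
    return out

def dlog_from_collision(x, x2):
    """For l = 2: from x != x2 with H_s(x) == H_s(x2), return w with g_2 = g_1^w."""
    num = (x[0] - x2[0]) % P_ORD
    den = (x2[1] - x[1]) % P_ORD       # nonzero for any genuine collision
    return num * pow(den, -1, P_ORD) % P_ORD

# Build a collision using knowledge of w, then recover w from the collision alone.
g1 = 4                                  # 4 = 2^2 is a square != 1, so it has order p
w = 777
s = (g1, pow(g1, w, P_MOD))
x = (5, 7)
x2 = ((x[0] + w * (x[1] - 8)) % P_ORD, 8)
recovered = dlog_from_collision(x, x2)
```

Here `x` and `x2` hash to the same value, and `recovered` equals `w`; an algorithm that finds collisions therefore also computes discrete logarithms in $G'$.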

An instantiation based on the RSA assumption. Fix 1= [|]. On security 
parameter $k$, algorithm $\mathsf{Gen}_H(1^k)$ chooses safe primes $p = 2p' + 1$ and $q = 2q' + 1$
with $p', q' > 2^k$, and sets $N = pq$. (The primes $p$ and $q$ are not used after
key generation, but because they are in memory during key generation this
construction is not strong.) $\mathsf{Gen}_H$ then chooses a random element $u \in \mathbb{Z}_N^*$, as
well as a prime $e > 2^{(\ell+1) \cdot k}$. The key is $s = (N, e, u)$.

Footnote (to the key length $n = 2 \log |S|$ in the proof of Theorem 4): We assume for
simplicity that elements of $S$ can be described using exactly $\log |S|$
bits; the proof can be modified suitably if this is not the case.

Let $G = \mathbb{Z}_N^* \times \mathbb{Z}$ and $G' = \mathbb{Z}_N^*$. Define

\[ H_s(r, x) = r^e \cdot u^x \bmod N. \]

Take $S = QR_N \times \{0, \ldots, 2^{\ell k}\} \subset G$ (where $QR_N$ denotes the set of quadratic
residues modulo $N$) and $T = \mathbb{Z}_N^* \times \{0, \ldots, 2^{(\ell+1) \cdot k}\}$.
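
The homomorphic behaviour of this hash can be checked numerically on a toy modulus. The Python sketch below is illustrative only: the modulus $23 \cdot 47$, the exponent 67, the element `U = 5`, and the helper names are assumptions, and the group operation on $G$ is taken to be multiplication in the first component and integer addition in the second.

```python
N = 23 * 47    # toy modulus from safe primes 23 = 2*11 + 1 and 47 = 2*23 + 1
E = 67         # a prime; stands in for e > 2^((l+1)k) with toy l = 2, k = 2
U = 5          # fixed element of Z_N^* (a real key would pick u at random)

def H(elem):
    """H_s(r, x) = r^e * u^x mod N for elem = (r, x)."""
    r, x = elem
    return pow(r, E, N) * pow(U, x, N) % N

def add(a, b):
    """Group operation on G = Z_N^* x Z: multiply first parts, add second parts."""
    return (a[0] * b[0] % N, a[1] + b[1])

def scale(m, a):
    """m-fold sum of a with itself in G."""
    return (pow(a[0], m, N), m * a[1])

x = (4, 3)     # 4 = 2^2 is a quadratic residue mod N
y = (9, 5)     # 9 = 3^2 is a quadratic residue mod N
m = 3
sigma = add(x, scale(m, y))   # the signature x + m*y in this group
```

With these values `sigma` is `(754, 18)`, and `H(sigma)` equals `H(x) * H(y)**m mod N`, which is exactly the equation the verifier checks.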

The homomorphic property of H s is easy to see. One can also verify that: 

1. $\log |S| = \omega(\log k)$ and $\log |G'| \leq \epsilon \cdot \log |S|$ and $\log |T| \leq (1 + \epsilon) \log |S|$.

2. $T$ is efficiently recognizable, and $\{x + my \mid x, y \in S,\ 0 \leq m < 2^k\} \subseteq T$.

3. For any $s$, any $m, m'$ with $0 \leq m < m' < 2^k$, and any $\sigma, \sigma'$:

\[ \big|\{x, y \in S \mid H_s(x + my) = \sigma \,\wedge\, H_s(x + m'y) = \sigma'\}\big| \leq 1. \]

(This uses the fact that $QR_N \cong \mathbb{Z}_{p'} \times \mathbb{Z}_{q'}$ has no elements other than the
identity whose order is less than $2^k$.)

Collision resistance follows via standard arguments (e.g., mi). 

Acknowledgments 

We are grateful to Yael Tauman Kalai for stimulating discussions, and collabo- 
ration during the early stages of this work. We also thank Krzysztof Pietrzak for 
suggesting the extension to the one-time signature scheme (based on one-way 
functions) that tolerates leakage linear in the secret key length. 

References 

1. Akavia, A., Goldwasser, S., Vaikuntanathan, V.: Simultaneous hardcore bits and 
cryptography against memory attacks. In: Reingold, O. (ed.) TCC 2009. LNCS, 
vol. 5444, pp. 474-495. Springer, Heidelberg (2009) 

2. Alwen, J., Dodis, Y., Wichs, D.: Public key cryptography in the bounded retrieval 
model and security against side-channel attacks. In: Halevi, S. (ed.) CRYPTO 2009.
LNCS, vol. 5677, pp. 1-17. Springer, Heidelberg (2009) 

3. Biham, E., Carmeli, Y., Shamir, A.: Bug attacks. In: Wagner, D. (ed.) CRYPTO 
2008. LNCS, vol. 5157, pp. 221-240. Springer, Heidelberg (2008) 

4. Boneh, D., Brumley, D.: Remote timing attacks are practical. Computer Net- 
works 48(5), 701-716 (2005) 

5. Boneh, D., DeMillo, R.A., Lipton, R.J.: On the importance of checking cryp- 
tographic protocols for faults. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, 
vol. 1233, pp. 37-51. Springer, Heidelberg (1997) 

6. Canetti, R., Dodis, Y., Halevi, S., Kushilevitz, E., Sahai, A.: Exposure-resilient 
functions and all-or-nothing transforms. In: Preneel, B. (ed.) EUROCRYPT 2000. 
LNCS, vol. 1807, pp. 453-469. Springer, Heidelberg (2000) 


7. Cramer, R., Damgård, I.: Secure signature schemes based on interactive protocols.
In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 297-310. Springer, 
Heidelberg (1995) 

8. De Santis, A., Di Crescenzo, G., Ostrovsky, R., Persiano, G., Sahai, A.: Robust non- 
interactive zero knowledge. In: Kilian, J. (ed.) CRYPTO 2001. LNCS, vol. 2139, 
pp. 566-598. Springer, Heidelberg (2001) 

9. Dodis, Y., Kalai, Y., Lovett, S.: On cryptography with auxiliary input. In: 41st 
Annual ACM Symposium on Theory of Computing (STOC), pp. 621-630. ACM, 
New York (2009) 

10. Dodis, Y., Kalai, Y., Vaikuntanathan, V.: Public-key encryption schemes with 
auxiliary inputs (manuscript, 2009) 

11. Dziembowski, S., Pietrzak, K.: Leakage-resilient cryptography. In: 49th Annual 
Symposium on Foundations of Computer Science (FOCS), pp. 293-302. IEEE, Los 
Alamitos (2008), Full version: http://eprint.iacr.org/2008/240 

12. Faust, S., Kiltz, E., Pietrzak, K., Rothblum, G.: Leakage-resilient signatures, 
http://eprint.iacr.org/2009/282

13. Fiat, A., Shamir, A.: How to prove yourself: Practical solutions to identification 
and signature problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, 
pp. 186-194. Springer, Heidelberg (1987) 

14. Fischlin, M., Fischlin, R.: The representation problem based on factoring. In: Pre- 
neel, B. (ed.) CT-RSA 2002. LNCS, vol. 2271, pp. 96-113. Springer, Heidelberg 
(2002) 

15. Goldwasser, S., Micali, S., Rivest, R.L.: A digital signature scheme secure against 
adaptive chosen-message attacks. SIAM Journal on Computing 17(2), 281-308 
(1988) 

16. Guillou, L.C., Quisquater, J.-J.: A “paradoxical” indentity-based signature scheme 
resulting from zero-knowledge. In: Goldwasser, S. (ed.) CRYPTO 1988. LNCS, 
vol. 403, pp. 216-231. Springer, Heidelberg (1990) 

17. Halderman, A., Schoen, S., Heninger, N., Clarkson, W., Paul, W., Calandrino, J., 
Feldman, A., Applebaum, J., Felten, E.: Lest we remember: Cold boot attacks on 
encryption keys. In: Proc. 17th USENIX Security Symposium, pp. 45-60. USENIX 
Association (2008) 

18. Hsiao, C.-Y., Reyzin, L.: Finding collisions on a public road, or do secure hash 
functions need secret coins? In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, 
pp. 92-105. Springer, Heidelberg (2004) 

19. Ishai, Y., Sahai, A., Wagner, D.: Private circuits: Securing hardware against prob- 
ing attacks. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 463-481. 
Springer, Heidelberg (2003) 

20. Kocher, P.C.: Timing attacks on implementations of Diffie-Hellman, RSA, DSS,
and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 
104-113. Springer, Heidelberg (1996) 

21. Kocher, P.C., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) 
CRYPTO 1999. LNCS, vol. 1666, pp. 388-397. Springer, Heidelberg (1999) 

22. Kumar, R., Rajagopalan, S., Sahai, A.: Coding constructions for blacklisting prob- 
lems without computational assumptions. In: Wiener, M. (ed.) CRYPTO 1999. 
LNCS, vol. 1666, pp. 609-623. Springer, Heidelberg (1999) 

23. Lamport, L.: Constructing digital signatures from a one-way function. Technical 
Report SRI-CSL-98, SRI International Computer Science Laboratory (October 
1979) 


24. Lyubashevsky, V., Micciancio, D.: Asymptotically efficient lattice-based digital sig- 
natures. In: Canetti, R. (ed.) TCC 2008. LNCS, vol. 4948, pp. 37-54. Springer, 
Heidelberg (2008) 

25. Micali, S., Reyzin, L.: Physically observable cryptography. In: Naor, M. (ed.) TCC 
2004. LNCS, vol. 2951, pp. 278-296. Springer, Heidelberg (2004) 

26. Naor, M., Segev, G.: Public-key cryptosystems resilient to key leakage. In: Halevi, 
S. (ed.) CRYPTO 2009. LNCS, vol. 5677, pp. 18-35. Springer, Heidelberg (2009), 
http://eprint.iacr.org/2009/105

27. Naor, M., Yung, M.: Universal one-way hash functions and their cryptographic 
applications. In: 21st Annual ACM Symposium on Theory of Computing (STOC), 
pp. 33-43. ACM Press, New York (1989) 

28. Nguyen, P.Q., Shparlinski, I.: The insecurity of the digital signature algorithm with 
partially known nonces. Journal of Cryptology 15(3), 151-176 (2002) 

29. Okamoto, T.: Provably secure and practical identification schemes and correspond- 
ing signature schemes. In: Brickell, E.F. (ed.) CRYPTO 1992. LNCS, vol. 740, pp. 
31-53. Springer, Heidelberg (1993) 

30. Ong, H., Schnorr, C.-P.: Fast signature generation with a Fiat-Shamir-like scheme.
In: Damgård, I.B. (ed.) EUROCRYPT 1990. LNCS, vol. 473, pp. 432-440. Springer,
Heidelberg (1991) 

31. Pietrzak, K.: A leakage-resilient mode of operation. In: Joux, A. (ed.) 
EUROCRYPT 2009. LNCS, vol. 5479, pp. 462-482. Springer, Heidelberg (2009) 

32. Porat, E., Rothschild, A.: Explicit non-adaptive combinatorial group testing
schemes. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M.,
Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part I. LNCS, vol. 5125,
pp. 748-759. Springer, Heidelberg (2008) 

33. Sahai, A.: Non-malleable non-interactive zero knowledge and adaptive chosen- 
ciphertext security. In: 40th Annual Symposium on Foundations of Computer Sci- 
ence (FOCS), pp. 543-553. IEEE, Los Alamitos (1999) 

34. Schnorr, C.-P.: Efficient identification and signatures for smart cards. In: Brassard, 
G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 239-252. Springer, Heidelberg (1990) 

35. Standaert, F.-X., Malkin, T., Yung, M.: A unified framework for the analysis of 
side-channel key recovery attacks. In: Joux, A. (ed.) EUROCRYPT 2009. LNCS, 
vol. 5479, pp. 443-461. Springer, Heidelberg (2009) 


Author Index 


Abe, Masayuki 435 
Aoki, Kazumaro 578 
Armknecht, Frederik 685 
Ateniese, Giuseppe 319 
Aumasson, Jean-Philippe 542 

Bellare, Mihir 232 
Benadjila, Ryad 162 
Billet, Olivier 162, 451 
Biryukov, Alex 1 
Boldyreva, Alexandra 524 
Brakerski, Zvika 232 
Brier, Eric 560 
Brumley, Billy Bob 667 

Çalık, Çağdaş 542
Cao, Zhenfu 303 

Cash, David 524 

Castagnos, Guilhem 469 
Cathalo, Julien 179 
Chen, Liqun 505 
Choi, Seung Geol 268, 287 
Coron, Jean-Sebastien 653 

Dachman-Soled, Dana 287 
Damgard, Ivan 52 
Ding, Ning 303 

Elbaz, Ariel 268 

Finiasz, Matthieu 88 
Fischlin, Marc 524 

Gazi, Peter 37 
Gueron, Shay 162 
Guo, Jian 578 

Hakala, Risto M. 667 
Herrmann, Mathias 487 

Jager, Tibor 399 
Joux, Antoine 347, 469 

Kamara, Seny 319 

Katz, Jonathan 197, 319, 636, 703 


Khazaei, Shahram 560 
Khovratovich, Dmitry 1 
Kurosawa, Kaoru 334 

Laguillaumie, Fabien 469 
Lai, Xuejia 19 
Lamberger, Mario 126 
Lehmann, Anja 364 
Libert, Benoit 179 
Lucks, Stefan 347 
Lunemann, Carolin 52 
Lyubashevsky, Vadim 598 

Ma, Rong 303 
Macario-Rat, Gilles 451 
Maes, Roel 685 
Malkin, Tal 268, 287 
Mandal, Avradip 653 
Matusiewicz, Krystian 106, 578 
Maurer, Ueli 37 
May, Alexander 487 
Meier, Willi 542, 560 
Mendel, Florian 126, 144 
Morrissey, Paul 505 

Naito, Yusuke 382 
Naor, Moni 232 
Naya-Plasencia, María 106
Nguyen, Phong Q. 469 
Nikolic, Ivica 106 
Nojima, Ryo 334 

Ohkubo, Miyako 435 
Ohta, Kazuo 382 
Okamoto, Tatsuaki 214 
Ozen, Onur 542 

Peyrin, Thomas 560 
Phan, Raphael C.-W. 542 

Pinkas, Benny 250 

Rechberger, Christian 126, 144 
Rijmen, Vincent 126 
Ristenpart, Thomas 232 
Robshaw, Matt J.B. 162 




Sadeghi, Ahmad-Reza 685 
Salvail, Louis 70 
Sasaki, Yu 106, 578 
Schaffner, Christian 70 
Schlaffer, Martin 106, 126, 144 
Schneider, Thomas 250 
Schwenk, Jorg 399 
Segev, Gil 232 
Sendrier, Nicolas 88 
Shacham, Hovav 232 
Smart, Nigel P. 250, 505 
Sotakova, Miroslava 70 
Stehle, Damien 617 
Steinfeld, Ron 617 
Sun, Xiaorui 19 
Sunar, Berk 685 

Takashima, Katsuyuki 214 
Tanaka, Keisuke 617 


Tessaro, Stefano 364 
Tuyls, Pim 685 

Vaikuntanathan, Vinod 636, 703 
Varıcı, Kerem 542

Wang, Lei 382, 578 
Warinschi, Bogdan 505, 524 
Wee, Hoeteck 287, 417 
Williams, Stephen C. 250 

Xagawa, Keita 617 

Yerukhimovich, Arkady 197 
Yilek, Scott 232 
Yoneyama, Kazuki 382 
Yung, Moti 179, 268 

Zhang, Zongyang 303 


